The mission of the Open Learning department is to transform teaching and learning at MIT and around the globe through the innovative use of digital technologies, and the engineering team is critical to the successful execution of that mission. The platform engineering group is responsible for building and maintaining the infrastructure and automation that lets us deliver these learning opportunities and experiences. This includes the OpenCourseWare platform, multiple deployments of Open edX that support students on campus and worldwide, and our growing set of Python/Django applications.
As a DevOps/Site Reliability Engineer at MIT Open Learning, you will be collaborating with internal application engineers, the open source community, and product stakeholders. You will work across infrastructure automation, system security, developer tooling, application delivery, and platform observability to ensure that hundreds of thousands of learners across the globe are able to reliably access the educational opportunities that will improve their lives and their communities.
- Deployment and maintenance of infrastructure that powers MIT OpenCourseWare (https://ocw.mit.edu/)
- Deployment, maintenance, and hosting of MITx, the MIT instance of Open edX (https://open.edx.org) used for residential teaching and learning.
- Work in collaboration with the edX team and Open edX community to improve the Open edX platform
- Developer support for the Open Learning Engineering team to enhance the open source edX platform and develop other scalable applications with flexible APIs that power learner experience across the institute and around the world
- Application support, release engineering, and systems administration in a 24x7 environment
- Advance best practices with the engineering team through participation in architecture, technical design and code reviews
- Work closely with the engineering team, the MIT community of developers, teachers, and learners, the Open edX community, and the open source community at large to brainstorm ideas and incorporate feedback
- Use modern infrastructure tools and platforms (e.g. Pulumi, Consul, Vault) to automate AWS cloud environments
- Seven years’ related experience
- Familiarity with network design and troubleshooting (DNS, TCP/UDP, IP routing)
- Strong knowledge of UNIX/Linux, especially in virtualized environments such as AWS, Google Cloud, Azure, or Heroku
- Experience with systems configuration management and provisioning tools such as Ansible, SaltStack, Chef or Puppet
- Operational experience with relational and document databases
- Solid familiarity with source code control systems such as Git or Mercurial
- Experience with monitoring and logging systems, using tools like Datadog or New Relic and ELK or Splunk
- Experience supporting developers and development environments
- Ability to work effectively with both technical and non-engineering personnel
- Experience with Python and the Django framework
- Experience configuring web servers and supporting services, including Nginx, Gunicorn, RabbitMQ, and Elasticsearch
- Experience with continuous integration and testing via tools like Concourse and GitHub Actions
- Experience working with containerization technologies like Docker
- Interest in designing and adapting software to be twelve-factor applications
- Knowledge of best practices regarding infrastructure and application security in a cloud environment
- Experience working with transactional email platforms such as Mailgun or SendGrid
MIT is an equal employment opportunity employer. All qualified applicants will receive consideration for employment and will not be discriminated against on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, or disability.