At DataRobot we’ve developed a powerful machine learning application, underpinned by a flexible platform that runs in many environments, from customer hardware to the cloud. We are seeking talented engineers with a strong background in Linux, programming, and distributed computing to help build core services and components that make up the DataRobot stack and deliver them to the most challenging customer environments.
The Platform Engineering team makes it possible to add any service to the DataRobot application, build it, test it, configure it, and ship it to our cloud environment and our on-premise customers. We manage the entire dependency stack, optimizing our product for security, reliability, and performance, as well as extreme portability. We make it possible for DataRobot to run anywhere from single VMs in the cloud to large, high-performance computing clusters, conforming to stringent security protocols for sensitive data.
In particular, you will:
- Build and improve DataRobot installation and administration tools.
- Automate the creation of infrastructure for our development, test, and production environments.
- Integrate new services and architectural components into the DataRobot application stack.
- Improve core systems and services related to logging, monitoring, security, and administration.
We’re a fast-paced team with a commitment to quality software. You need to be willing to learn whatever it takes to get the job done, from diving into Linux internals, to programming your own services and libraries in Python, to orchestrating Hadoop environments in the cloud with tools like Terraform and Ansible.
- Programming skills: qualified candidates will have
  - Experience developing applications or tools using Python, Java, or similar (Python strongly preferred).
  - Comfort writing unit and functional tests for their code.
  - Ability to create distributable packages using setuptools, Maven, or similar.
  - Experience using CI/CD systems to test, deploy, and ship their code.
- Linux skills: qualified candidates will have
  - Comfort with the Bash CLI and scripting.
  - Familiarity with multiple Linux distributions.
  - Strong troubleshooting skills: finding and parsing logs, inspecting system status, and working in distributed systems.
- Operational skills: excellent candidates will have
  - Experience with cloud infrastructure providers (AWS strongly preferred).
  - Experience building and maintaining production environments and services.
  - Experience with the full software development lifecycle, from development to productionization.
- Soft skills: candidates will be evaluated on
  - Written and verbal communication in English.
  - Ability to work within teams.
  - Ability to drive a project from ideation to completion.
- Experience with continuous integration and continuous delivery, including developing automation for build, test, deployment, and release processes.
- Experience with configuration management tools like Ansible, Puppet, or SaltStack.
- Experience with container orchestration technologies like Mesos, Kubernetes, and Docker Swarm.
- Knowledge of server provisioning systems such as Terraform, Packer, or Cobbler.
- Experience administering or using Hadoop.
- Experience delivering software to on-premise environments.
- Expertise in Linux packaging and build tools.