Senior Software Engineer/SRE - Artificial Intelligence Group
Location: New York, New York
Type: Full Time
Internal Number: 20137083
The Team: The AI Group is the central engineering group responsible for driving Machine Learning adoption at Bloomberg, with over 200 researchers and engineers working together to provide clients with best-in-class news, research, market data, and analytics using innovative machine learning technology. We directly impact a wide variety of our flagship products, including news, research, pricing, communications platforms, and search and discovery tools. We work across a variety of ML fields, including natural language processing, information retrieval, time series analysis, and recommender systems.
Some projects where we are looking for experienced software engineers include unified search, question answering, query parsing, financial instrument (e.g., fixed income) pricing, and dialogue understanding. Our engineers are responsible for architecting and implementing services end-to-end, overcoming the unique challenges that come with machine learning systems in the financial domain. In addition, we contribute to open source; projects we have contributed to and work with, including Solr, Koan, KFServing, Cloud Native Buildpacks, and PyTorch Lightning, directly impact our production services.
Broadly, we are looking for colleagues who are passionate about software engineering and who want to learn more about:
Parallel and distributed systems like Kubernetes, OpenMPI, Apache Kafka,
Applied Machine Learning frameworks like PyTorch, scikit-learn, TensorFlow, and
Data-driven frameworks like Apache Spark, Apache Solr, Pandas, Apache Hadoop.
If all of this sounds like the projects you are passionate about and want to work on, apply! Do check out our blog at https://TechAtBloomberg.com/AI and learn more about our projects and research.
The Role: As a Senior engineer in the AI Group, you will have the opportunity to make key technical decisions which help define the future of infrastructure for LLM training and inference at Bloomberg! You will apply your existing engineering experience while gaining new experience in Kubernetes, containerization, GPUs, Data Science, and large distributed ML workloads.
We'll trust you to:
Provide our AI Research teams with self-service tools to provision and maintain HPC K8S-based infrastructure in on-prem data centers as well as in public clouds
Design and implement scaling policies and hardware selection for HPC infrastructure based on business and technical SLAs
Implement and maintain containerized runtimes for training and inference workloads, and benchmark and further optimize these runtimes
Use tools and technologies like Terraform to provision and manage infrastructure in a repeatable and automated manner
Develop and use monitoring solutions for training jobs and inference endpoints, and implement alerting and automated remediation
You'll need to have:
4+ years of programming experience with an object-oriented programming language (Python)
A degree in Computer Science, Engineering or similar field of study or equivalent work experience
Proven experience with Unix, Unix tools and shell scripting
Experience managing infrastructure in Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP)
Proficiency with infrastructure-as-code technologies such as Terraform, CDK, and AWS CloudFormation
Proficiency with Kubernetes and related technologies such as Helm, Kustomize, kubectl, eksctl, Calico, and Kyverno
We'd love to see:
Knowledge of LLVM and CUDA ecosystems, Linux packaging
Solid understanding of networking (InfiniBand, AWS EFA, RoCE)
Familiarity with Deep Learning Frameworks (PyTorch, TensorFlow) and overall software stack
Familiarity with GPUs (NVIDIA chips) and ASICs in general
Bloomberg is an equal opportunity employer, and we value diversity at our company. We do not discriminate on the basis of age, ancestry, color, gender identity or expression, genetic predisposition or carrier status, marital status, national or ethnic origin, race, religion or belief, sex, sexual orientation, sexual and other reproductive health decisions, parental or caring status, physical or mental disability, pregnancy or parental leave, protected veteran status, status as a victim of domestic violence, or any other classification protected by applicable law.
Bloomberg provides reasonable adjustment/accommodation to qualified individuals with disabilities. Please tell us if you require a reasonable adjustment/accommodation to apply for a job or to perform your job. Examples of reasonable adjustment/accommodation include but are not limited to making a change to the application process or work procedures, providing documents in an alternate format, using a sign language interpreter, or using specialized equipment. If you would prefer to discuss this confidentially, please email AMER_recruit@bloomberg.net (Americas), EMEA_recruit@bloomberg.net (Europe, the Middle East and Africa), or APAC_recruit@bloomberg.net (Asia-Pacific), based on the region you are submitting an application for.
Salary Range: 160,000 - 240,000 USD Annually + Benefits + Bonus
The referenced salary range is based on the Company's good faith belief at the time of posting. Actual compensation may vary based on factors such as geographic location, work experience, market conditions, education/training and skill level. We offer one of the most comprehensive and generous benefits plans available and offer a range of total rewards that may include merit increases, incentive compensation [Exempt roles only], paid holidays, paid time off, medical, dental, vision, short and long term disability benefits, 401(k) + match, life insurance, and various wellness programs, among others. The Company does not provide benefits directly to contingent workers/contractors and interns.