Bio Data Engineer
- Information Technology
- EB-Environ Genomics & Systems Bio
- Requisition #103168
Berkeley Lab’s (LBNL) Environmental Genomics and Systems Biology (EGSB) Division is looking for a Bio Data Engineer to join the US Department of Energy’s (DOE) Systems Biology Knowledgebase (KBase) team!
In this exciting role, you will support a Senior KBase Developer in defining Extract, Transform, and Load (ETL) processes and pipelines that convert biological data files and metadata into a structured, standardized biological data model. This work is critically important: it prepares biological data so that it can be compared, discovered, accessed, and organized, enabling future machine learning/artificial intelligence (ML/AI) efforts to generate accurate and meaningful insights into biological and environmental interactions and processes.
This role will work closely with the KBase User Engagement Lead, who liaises with a community of stakeholders, including Research Scientists, Data Scientists, Analysts, other engineers, and our DOE Program Managers. A rapid design-build-test-learn (DBTL) process ensures our biological data model is accurate, robust, rapidly extensible, and interoperable with existing systems across the Biological and Environmental Research Program (BER) landscape.
This position has an anticipated start date of March 1, 2025.
What You Will Do:
- Support the design, development, and maintenance of scalable Extract, Transform, and Load (ETL) processes that integrate and process biological data sets from multiple sources.
- Assist the Senior KBase Developer with implementing and debugging all relevant software in support of these efforts.
- Collaborate with scientists, subject matter experts (SMEs), and analysts to understand their data requirements and deliver solutions that enhance data accessibility and performance.
- Rapidly learn and understand technologies related to databases, schema design and formalization, and ETL processes, including JSON Schema, Frictionless (https://frictionlessdata.io/), and LinkML (https://linkml.io).
- Assist with monitoring, troubleshooting, and resolving problems within ETL pipelines.
- Ensure data quality and integrity across the entire data lifecycle.
In addition to the above, the Bio Data Engineer 2 will also:
- Partner with the Senior KBase Developer on schema development and designs.
- Stay abreast of industry trends and best practices in Big Data processing and Data Engineering.
- Maintain a thorough understanding of biological science data types, ontologies, and data sources.
- Independently manage pipelines and workflows, and maintain an understanding of the broader data infrastructure.
What is Required:
- A Bachelor’s Degree (or equivalent knowledge/training) in Computer Science, Engineering, or a related field and a minimum of 2 years of relevant work experience in Data Engineering, or an equivalent combination of education and experience.
- Experience working with databases, large datasets, and Python, including API access and packages for data transformation and analysis.
- Experience with Git and other version control tools.
- Excellent oral and written communication skills, including the ability to organize and present information to technical and non-technical audiences.
- Strong analytical skills, including the ability to identify and solve complex technical problems.
- Demonstrated interpersonal skills, including the ability to work collaboratively and effectively within an interdisciplinary team environment.
Additional Qualifications for the Bio Data Engineer 2:
- A Bachelor’s Degree (or equivalent knowledge/training) in Computer Science, Engineering, or a related field and a minimum of 5 years of relevant work experience in Data Engineering, or an equivalent combination of education and experience.
- Experience working with Extract, Transform, and Load (ETL) processes.
- Experience with containerization and deployment technologies (e.g., Docker).
- Experience with data storage and transfer protocols, including Globus and S3.
- Experience with peer code review, code linting, QC, automated testing, CI/CD, and agile development.
Desired Qualifications:
- Experience working with software development tools and practices in the biological domain.
- Experience with ontologies, controlled vocabularies, and their applications.
Notes:
- Application Deadline: For full consideration, please apply with a resume by January 15, 2025.
- Appointment Type: This is a full-time, 2-year, benefits-eligible Term appointment, exempt from overtime pay (monthly paid), with the possibility of extension or conversion to a Career appointment based upon satisfactory job performance, continuing availability of funds, and ongoing operational needs.
- Salary Information: It is not typical for an individual to be offered a salary at or near the top of the range for a position. Salary will be commensurate with the final candidate’s qualifications and experience, including skills, knowledge, relevant education, and certifications, and will be aligned with the internal peer group.
- Level 1: This position is expected to pay $86,628 - $108,276 per year for job code C70.1.
- Level 2: This position is expected to pay $109,152 - $136,428 per year for job code C70.2.
- Background Check: This position may be subject to a background check. Any convictions will be evaluated to determine if they directly relate to the responsibilities and requirements of the position. Having a conviction history will not automatically disqualify an applicant from being considered for employment.
- Work Modality: This position is eligible for onsite, hybrid, or remote work. Remote workers are defined as individuals who reside within the United States, but 150 miles away from Berkeley Lab. Work schedules are dependent on business needs, and work may be required to be performed during traditional business hours in Pacific Standard Time (PST). There may be an expectation to intermittently conduct work, attend meetings, and train on site at Lawrence Berkeley National Lab, located at 1 Cyclotron Road, Berkeley, CA 94720.
- This position is not eligible for relocation assistance.
- Eligibility: This position is not eligible for visa sponsorship (e.g., H-1B, TN, STEM OPT, etc.).
Learn About Us:
Berkeley Lab (LBNL) addresses the world’s most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab’s scientific expertise has been recognized with 16 Nobel prizes. The University of California manages Berkeley Lab for the U.S. Department of Energy’s Office of Science.
Working at Berkeley Lab has many rewards, including a competitive compensation program, excellent health and welfare programs, a retirement program that is second to none, and outstanding development opportunities. To view information about the many rewards offered at Berkeley Lab, click here.
Berkeley Lab is committed to Inclusion, Diversity, Equity and Accountability (IDEA) and strives to continue building community with these shared values and commitments.
Berkeley Lab is an Equal Opportunity and Affirmative Action Employer. We heartily welcome applications from women, minorities, veterans, and all who would contribute to the Lab’s mission of leading scientific discovery, inclusion, and professionalism. In support of our diverse global community, all qualified applicants will be considered for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, age, or protected veteran status.
Equal Opportunity and IDEA Information Links:
Know your rights: click here for the supplement "Equal Employment Opportunity is the Law" and the Pay Transparency Nondiscrimination Provision under 41 CFR 60-1.4.