Scientific Networking Summer Student Assistant
ESnet’s mission is to accelerate science by delivering unparalleled networking capabilities, tools, and innovations. ESnet interconnects the US National Laboratory system, is widely regarded as a technical pioneer, and is currently the fastest science network in the world. We are working at the leading edge of software-defined networking, network knowledge plane, dynamic network infrastructure, network visualization, multi-domain and multi-layer architectures, deep learning, and more. Opportunities exist across the organization to support research, software, IT, data analysis, finance, operational improvements, project management tools, reports, and communications.
To learn more about ESnet’s internship program, visit https://www.es.net/about/careers/student-internships/
ESnet is hiring Student Assistants for the following 12 projects. As part of the application process, please submit a statement listing which 1-3 projects you are interested in and why. You may be contacted by more than one project team. All projects are for Summer 2025 and are hybrid onsite/remote unless noted otherwise.
Project 1: Linux CLAT
Required Skills: CS, Computer networks, Linux
Abstract:
Linux has limited options for automatic transition to IPv6-only operation. The most common transition mechanism on other computing platforms is 464XLAT, defined in RFC 6877. The host-side component of this mechanism is called the CLAT (customer-side translator). The student will work on:
- Survey Linux CLAT options
- Test existing implementations, document usability, supportability
- Document any limitations
- Create a set of performance tests comparing against Windows 11 and macOS CLAT implementations
- Outline potential options for improvement of existing CLAT implementations under Linux
- Write an academic paper to publish the results of the study
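For applicants unfamiliar with 464XLAT, the address mapping it relies on can be sketched in a few lines. This is a minimal illustration, not part of any CLAT implementation: it assumes the well-known NAT64 prefix 64:ff9b::/96 from RFC 6052, handles only /96 prefixes, and the function names `synthesize` and `extract` are hypothetical.

```python
import ipaddress

# Well-known NAT64 prefix from RFC 6052 (an assumption for this sketch;
# real deployments may use a network-specific prefix instead).
WELL_KNOWN_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize(ipv4, prefix=WELL_KNOWN_PREFIX):
    """Embed an IPv4 address in the low 32 bits of a /96 NAT64 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


def extract(ipv6, prefix=WELL_KNOWN_PREFIX):
    """Recover the embedded IPv4 address from a synthesized IPv6 address."""
    addr = ipaddress.IPv6Address(ipv6)
    if addr not in prefix:
        raise ValueError("address is not inside the NAT64 prefix")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


if __name__ == "__main__":
    v6 = synthesize("192.0.2.33")
    print(v6)           # synthesized IPv6 address
    print(extract(v6))  # original IPv4 address
```

A CLAT performs this kind of stateless translation on every packet, so the survey and performance tests above would exercise far more than address arithmetic (header translation, MTU handling, prefix discovery).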
Project 2: Equipment and Topology Discovery
Required Skills: Java, Spring, Maven, REST, Kafka, Docker, Linux/macOS, Kubernetes (K8s)
Abstract:
ESnet’s network grows with each iteration. As the amount of equipment increases, it becomes more complicated to keep track of deployed equipment, changes, configurations, status, topology, and so on. The Software Engineering Group is working on an innovative microservice-based solution to track all this information and provide a central application interface and repository for it. In addition, data transformation services continuously process configuration update events and transform vendor-specific data into normalized representations, similar to approaches such as the Network Markup Language. The student will work on the development and evaluation of new data sources, technologies, normalization models, and applications for the Equipment and Topology Discovery service.
Project 3: SENSE Multi-Domain Network Automation, Application Workflow and Performance Evaluation
Term: Spring/Summer 2025
Required Skills: Computer systems, networks, Linux, Python, Java/JavaScript
Abstract:
The SENSE project (http://sense.es.net) is building smart network services to accelerate scientific discovery in the era of big data driven by Exascale, cloud computing, machine learning and AI. The project's architecture, models, and demonstrated prototype define the mechanisms needed to dynamically build end-to-end virtual guaranteed networks across administrative domains, with no manual intervention. The student will work on:
- Python API client and application workflow automation
- SENSE Orchestrator Web Portal UI and monitoring data dashboards
- Network and data movement performance measurement and analysis (graduate student preferred)
Project 4: Application of AI/ML techniques to network capacity planning and traffic matrix prediction
Required Skills: AI/ML (GNN, transformer, RNN etc.), PyTorch, Linux
Abstract:
ESnet moves hundreds of petabytes of data each month (https://my.es.net/traffic-volume). We are looking to gain insight into how our users are utilizing the network and to identify trends that will allow us to make predictions. This project aims to build an AI/ML-based traffic forecaster to predict near-term and longer-term traffic patterns. The challenge of scientific workloads is that the traffic patterns they create tend to be highly volatile, motivating a novel AI/ML approach that captures the spatio-temporal dynamics of the traffic data. The student will work on:
- Prototype an AI/ML model to forecast near-term and longer-term traffic matrices
- Backtest the prototype model performance with historical data
- Stretch goal: Integrate with the current software stack if time permits
Project 5: Reed-Solomon Forward Error Correction for lost packets
Required Skills: Understanding of Galois field arithmetic, data structures, and algorithms, sufficient to create a high-speed, compute-intensive Reed-Solomon encoder and decoder.
Abstract:
ESnet is developing a high-speed streaming protocol based on IP protocols for connecting real-time instruments to supercomputers. This protocol requires forward error correction (FEC) to compensate for lost packets in the network. The FEC needs to be co-designed with the packet protocol to provide adequate coverage for the types of losses we see in a real network. The student will learn how the IP protocol works and how it introduces errors, how Reed-Solomon decoders operate, and how to modify the standard algorithms to tune the solution to this unique problem.
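As a flavor of the arithmetic involved, the sketch below shows multiplication and inversion in GF(2^8) and the simplest possible packet-level FEC: a single XOR parity packet that can rebuild one lost packet. It is an illustration under stated assumptions, not ESnet's protocol or a Reed-Solomon codec: the reduction polynomial 0x11d is one common choice, and the function names are hypothetical.

```python
# Assumption: GF(2^8) with reduction polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d),
# a common choice in Reed-Solomon implementations.
POLY = 0x11D


def gf_mul(a, b):
    """Multiply two GF(2^8) elements with shift-and-add (carry-less) arithmetic."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= POLY        # reduce back into 8 bits
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse via a^254, since a^255 == 1 for nonzero a."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def xor_parity(packets):
    """Build one parity packet as the byte-wise XOR of equal-length packets."""
    parity = bytearray(len(packets[0]))
    for pkt in packets:
        for i, byte in enumerate(pkt):
            parity[i] ^= byte
    return bytes(parity)


def recover_one(received, parity):
    """Rebuild the single missing packet (marked None) from the survivors."""
    survivors = [p for p in received if p is not None]
    if len(survivors) != len(received) - 1:
        raise ValueError("XOR parity can recover exactly one lost packet")
    return xor_parity(survivors + [parity])


if __name__ == "__main__":
    data = [b"pkt0", b"pkt1", b"pkt2"]
    parity = xor_parity(data)
    print(recover_one([data[0], None, data[2]], parity))
```

A single parity packet tolerates only one loss per group. Reed-Solomon codes generalize this by computing several parity symbols with GF(2^8) multiplications, so that multiple losses per group can be repaired.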
Project 6: Machine Learning for Identification of REE-CM Hot Zones
Required Skills: Machine Learning
Abstract:
Characterization of Rare Earth Elements and Critical Minerals (REE-CM) in unconventional and secondary sources is a complex task that must overcome the challenges of detecting low and variable concentrations and the uniqueness of every source material deposit in terms of composition, host material, and disposal environment. We propose a machine learning (ML)-aided multi-physics approach for rapid identification and characterization of REE-CM hot zones in mine tailings for efficient recovery, with a focus on coal and sulfide mine tailings and other processing or utilization byproducts, such as fly ash and refuse deposits. This multi-physics approach integrates a range of geophysical, radiological, and optical technologies deployed on aerial and surface platforms suitable for REE-CM prospecting. It provides a cross-scale capability, from whole-tailing REE-CM hot zone identification to mineralogical and REE-CM characterization and quantification. Advanced ML capabilities are key to integrating these multi-physics datasets for identifying hot zones and optimizing sensing technology deployment. Feature-engineering ML combined with federated learning and transfer learning will be used for data organization, feature extraction, and privacy protection.
Project 7: Time Series Analysis of Network Utilization for Distributed Scientific Workflows
Location: Onsite preferred
Required Skills: Python, machine learning, data analysis
Abstract:
Large-scale scientific projects and simulations generate massive amounts of data, which are then transferred to scientific clusters for analysis. This process involves thousands of concurrent data movements and accesses, resulting in redundant transfers and increased network traffic. A regional caching strategy can mitigate this issue by sharing data among users and sites, reducing latency, and improving overall application performance. This project aims to:
1. Explore various time series analysis methods to develop a robust model for network utilization data.
2. Design multi-variate, multi-step time series mechanisms, including machine learning approaches, to capture complex patterns and trends.
3. Discover insights about network utilization trends, enabling data-driven decisions to optimize caching strategies and improve network performance.
Project 8: Analyzing Dataset Popularity for Optimizing In-network Storage
Location: Onsite preferred
Required Skills: Python, machine learning, data analysis
Abstract:
Scientific computing has seen a surge in large data transfers. Many of these transfers are redundant, with users repeatedly accessing the same data files for debugging or collaborating on related research topics. To mitigate this, regional data caches have been designed to reduce network traffic and latency, ultimately improving application performance. This project aims to investigate the popularity of datasets in regional data caches, with a focus on determining the predictability of data access patterns. We seek to inform the development of caching policies that can optimize network utilization. Our goal is to answer key questions such as: Which datasets are most frequently accessed? Can we identify patterns in data access behavior? And how can we leverage this knowledge to improve caching strategies?
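The first of these questions can be approached with very simple tooling before any modeling begins. The sketch below is a minimal illustration on a made-up access log; the log format, dataset names, and function names are all hypothetical and do not reflect ESnet's cache data.

```python
from collections import Counter


def popularity(accesses):
    """Return (dataset, access_count) pairs, most frequently accessed first."""
    return Counter(accesses).most_common()


def reuse_fraction(accesses):
    """Fraction of accesses that request a dataset already seen earlier.

    This is the hit ratio an unbounded cache would achieve, so it is an
    upper bound for any real caching policy on the same access sequence.
    """
    seen = set()
    repeats = 0
    for name in accesses:
        if name in seen:
            repeats += 1
        else:
            seen.add(name)
    return repeats / len(accesses) if accesses else 0.0


if __name__ == "__main__":
    # Hypothetical access log: one dataset name per request, in time order.
    log = ["run-a", "run-b", "run-a", "run-c", "run-a", "run-b"]
    print(popularity(log))
    print(reuse_fraction(log))
```

Counts like these give a baseline against which more predictive models of access behavior can be compared.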
Project 9: Predicting Laser Component Failures in Network Routers for Proactive Maintenance
Location: Onsite preferred
Required Skills: Python, machine learning, data analysis
Abstract:
High-speed network routers and switches rely on lasers for data transmission, and unexpected failures can have significant impacts on network connectivity. We aim to develop a predictive maintenance tool that can identify potential laser component failures before they occur. Using digital monitoring data from ESnet core routers, this project will apply analysis algorithms for feature extraction and failure prediction. Our goal is to identify a reliable algorithm that can predict failure events with sufficient lead time, enabling proactive maintenance and minimizing network disruptions. Initial evaluation of the existing monitoring data has already revealed opportunities for improving the data collection process, and this project will continue to explore and refine the dataset. By developing a usable tool for the network operations center, we can validate the effectiveness of our approach and improve network reliability.
Project 10: Developing Packets: ESnet’s new Design System
Location: Remote Only
Required Skills:
- Visual and communication design principles including typography, iconography, color theory, and grid systems with experience crafting for multiple web platforms.
- Figma, Sketch, Adobe Creative Suite, or comparable software.
- JavaScript
Abstract:
ESnet is developing a design system to be used across its product portfolio. A design system is a set of shared patterns for handling the common look and feel of user interfaces at scale. Design systems enable organizations to develop frontend solutions at a higher velocity with more consistent output. The scope of this project is to assist in designing components by delivering low-fidelity wireframes and high-fidelity visual comps. The ideal candidate will also assist with crafting the documentation for these new components and with handoff to developers. The project could also include development of React components if the candidate's skill set allows.
Project 11: Wireless Networking Field Measurement
Required Skills: Python, Raspberry Pi
Abstract:
ESnet is increasing its support for field science by deploying a variety of wireless technologies that will extend our data network services to locations where we cannot deploy fiber optics. We need to test and document how we build and operate small-board (Raspberry Pi) systems that provide bandwidth and wireless link data, so that we can troubleshoot and support user connections. The student will assist ESnet and environmental researchers in building, setting up, and testing ruggedized Raspberry Pi units deployed using FLOTO (an edge device management tool developed by the University of Chicago) and perfSONAR (a widely used network measurement tool). This project supports the ESnet Greenfield Wireless Edge program and will provide a mixture of hands-on and coding-related tasks as we develop ways to better support earth and environmental scientists.
Project 12: Leveraging AI/ML and data analytics to improve System Observability
Required Skills:
- Machine learning, Python / Jupyter notebooks, related Python libraries (scikit-learn, pandas, NumPy), NoSQL databases
Abstract:
The Measurement and Analysis group manages Stardust, a distributed network telemetry collection system. The student will work with diverse data sets, including system logs, incident reports, and other telemetry data, and will apply machine learning and forecasting techniques to uncover insights that drive data-driven decision-making and improve our operational efficiency. Responsibilities include:
- Data Mining: Preprocess large volumes of system and incident data for insightful analysis.
- Analysis: Conduct exploratory data analysis to spot exciting patterns, trends, and anomalies.
- Machine Learning: Utilize existing machine learning models to analyze data and make predictions based on historical trends.
- Forecasting: Leverage statistical techniques and machine learning algorithms to develop forecasting models that optimize system performance and network behavior.
- Collaboration: Team up with our amazing group to understand data needs and contribute to the development of innovative, data-driven solutions.
Notes:
- Students must be enrolled in a full-time academic program at an accredited college or university. Proof of enrollment is required.
- Spring 2025 Term is 16 weeks (1/13/2025 - 5/2/2025). Summer is 12 weeks (6/2/2025 - 8/22/2025). Student participation requires 20 hours per week for Spring/Fall, and 40 hours per week for Summer appointments. A "late start" date can be considered for academic reasons.
- Work will be primarily performed in Berkeley, CA, or the Champaign, Bloomington, Illinois office, or remotely.
- The appointment can be renewed based on satisfactory job performance and operational needs.
- Salary will be predetermined based on student step rates.
- Positions may be subject to a background check. Any convictions will be evaluated to determine if they directly relate to the responsibilities and requirements of the position. Having a conviction history will not automatically disqualify an applicant from being considered for employment.
Want to learn more about working at Berkeley Lab? Please visit: careers.lbl.gov
Berkeley Lab is committed to inclusion, diversity, equity and accessibility and strives to continue building community with these shared values and commitments. Berkeley Lab is an Equal Opportunity and Affirmative Action Employer. We heartily welcome applications from women, minorities, veterans, and all who would contribute to the Lab's mission of leading scientific discovery, inclusion, and professionalism. In support of our diverse global community, all qualified applicants will be considered for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, age, or protected veteran status.
Equal Opportunity and IDEA Information Links:
Know your rights: see the supplement "Equal Employment Opportunity is the Law" and the Pay Transparency Nondiscrimination Provision under 41 CFR 60-1.4.