Background
Interacting with existing astronomical datasets can transform how students learn astronomy while developing data science skills that are valuable beyond the classroom. Data can come from large-scale surveys, space-based observatories, individual scientists, or students. Students can learn to select, reduce, visualize, and interpret authentic astronomical data while applying data science techniques to construct astronomy knowledge. Many free web-based tools are available for teachers to leverage when integrating data science pedagogy into the classroom. This page discusses how to use the Infrared Science Archive (IRSA), the Sloan Digital Sky Survey (SDSS), and other publicly available data archives in data science contexts in the astronomy classroom, and how the activities connect data science pedagogy to astronomy concepts in a typical Astro 101 course. Several classroom-tested activities are linked below. Some activities leverage computer programming, while others involve no programming at all. All activities are freely available, so educators can expand and adapt them as needed.
Data-centric Classroom Activities Described Below
- Kepler’s 3rd Law Spreadsheet activity
- Kepler’s 3rd Law Desmos activity
- Kepler’s 3rd Law Python activity
- Skynet Robotic Telescope & IRSA Nebula activity
- Star Cluster Color-magnitude Diagram with Gaia activity
- 3-Color Astronomical Image Construction with IRSA activity
- Python Hertzsprung-Russell Diagram Introduction activity
- Python Relative Abundance using Equivalent Width activity teacher version (student version)
- Python Hubble-Lemaître’s Law using SDSS BOSS plate activity teacher version (student version)
- Bonus activity: Python Aperture Photometry Cluster distance activity
Using Archival Data alongside Computational Thinking
Modern astronomy has become a discipline driven by data science and computing (Lundgren & Trainor, 2023; Norman et al., 2019; Rebull, 2024; Taghizadeh-Popp et al., 2020). The astronomy research community has come to accept that to develop future astronomical researchers, early career scientists need to learn data science and computer science (Norman et al., 2019; Lundgren & Trainor, 2023). In undergraduate physics courses, the integration of computing is already happening (Apple et al., 2021; Hutchins et al., 2020; Lee et al., 2020; Orban & Teeling-Smith, 2020; Weintrop et al., 2016; Weller et al., 2021). It is time to bring data science and computational thinking into K-12 classes as well. Doing so is feasible: modern data science (Bargagliotti et al., 2020; Israel-Fishelson et al., 2024) and computational thinking (Orban & Teeling-Smith, 2020; Weintrop et al., 2016) can be brought into high school science courses like astronomy and physics (Lundgren et al., 2019; Newland, 2020; Rebull, 2024).
Data Science Education Frameworks

One way to design data science activities for the classroom is to use the Pre-K-12 Guidelines for Assessment and Instruction in Statistics Education II (GAISE II) framework (Bargagliotti et al., 2020). The GAISE II framework lays out the statistical problem-solving process in four steps, which can be applied iteratively. First, students formulate one or more statistical investigative questions. Second, they collect or consider data that helps them answer the question they asked. Third, they analyze the data. Finally, they interpret and communicate their results, including possible alternative explanations.
The Computational Thinking in Math and Science (CTMS) taxonomy (Weintrop et al., 2016) defines computational thinking practices used in secondary science and math classrooms. The CTMS taxonomy lays out four practices that can exist alone or in concert with the other practices. The CTMS data practices align very closely with the GAISE II framework.
Using both the GAISE II framework and the CTMS taxonomy, curriculum designers can use traditional spreadsheets (Israel-Fishelson et al., 2024), data science-specific tools like CODAP (Sullivan, 2022) or Desmos (MacIsaac, 2016), or tools that involve programming, like Google Colab or Jupyter notebooks (LaMee, 2021; Newland, 2020). The tools can vary across activities and classrooms, but the same statistical problem-solving process applies in each case (Bargagliotti et al., 2020).
Public Science Archives in Science Education
Many publicly available archives offer astronomical data. The high school astronomy classroom activities shared here draw on three large-scale datasets – the Sloan Digital Sky Survey (SDSS) (Kollmeier et al., 2019), the NASA/IPAC Infrared Science Archive (IRSA) (Rebull, 2024), and the NASA Exoplanet Archive (Howell et al., 2014).
Understanding how orbits are modeled using Kepler’s laws is a fundamental part of most astronomy and physics courses in high school. The first activity involves students constructing knowledge about Kepler’s laws using NASA Exoplanet Archive data. The Kepler’s 3rd law data activity is presented in three versions: spreadsheets, the Desmos web-based calculator, and a Google Colab (Jupyter) notebook. A separate page is dedicated to the different versions of the Kepler’s 3rd law activity. In each version, students apply the statistical problem-solving process, using linear regression to ask how the host star’s mass compares to that of our Sun.
Learning Orbits with Regression
- Kepler’s 3rd Law – Data Regression with Spreadsheet Activity
  - Regression & algebra to teach orbits with NASA exoplanet dataset
  - Use spreadsheets to reduce tabular data

- Kepler’s 3rd Law – Data Regression with Desmos Activity
  - Scaffolded regression & list tools
  - Blends spreadsheets and coding
  - Reduced cognitive load

- Kepler’s 3rd Law – Python for Data Science Activity
  - Computational essay format (Odden et al., 2023)
  - Scaffolded coding experience
  - Regression & visualization with Python

Infrared Science Archive (IRSA) as Data Pedagogy
- Vast astronomical research archive
- Web-based real-time data interrogation
Using Skynet Images with IRSA Catalogs – Query Across Databases
- Activity designed using robotic telescope images to query IRSA holdings
- Cross-match sources from images
- Use histograms to determine distance

Creating CMD with IRSA & Gaia – Stellar Population Discovery
- Intuitive evolution of relationships
- HR diagram through discovery
- Data-driven learning

Creating 3-color Images with IRSA – Multi-wavelength Data Interrogation
- See “invisible” infrared data through image construction

Computer Science & Data Science
- Computational essay format
- Authentic techniques with Python
Hertzsprung-Russell Diagram Introduction
- Large authentic dataset from CSV
- Use of pandas dataframes

Relative Abundance via Spectroscopy
- Scaffolded spectra reduction
- Statistical metric developed in context

Universal Expansion using Galaxy Data
- Develop Hubble’s Law empirically
- Scaffolded statistical thinking in context

The web-based NASA/IPAC IRSA tools are not designed for classroom teachers, but teachers still need to create activities using the rich data available. The Skynet and IRSA Nebula activity has students asking about the distance of the star clusters inside the nebulae from Earth. The Creating Color-Magnitude Diagrams using IRSA activity has students using statistical thinking to build visualizations of stellar populations to characterize the ages and stellar makeup of clusters. The 3-Color Astro Image with IRSA activity has students using actual data to create images of objects in space, which can be used to understand their astrophysical nature. The NASA/IPAC IRSA tools also access SDSS data and other extensive data archives, allowing students to cross-compare astrophysical information about objects from different catalogs.
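As a rough illustration of the statistical thinking behind a color-magnitude diagram, the sketch below converts apparent magnitudes to absolute magnitudes using a cluster distance estimated from parallax. It is not taken from the activities: the function names are hypothetical, the input values are made up for illustration, and the sketch assumes a Gaia-style catalog that provides parallaxes in milliarcseconds. It also uses the simple d = 1/parallax estimate and ignores extinction, both of which are simplifications.

```python
import numpy as np

def cluster_distance_pc(parallax_mas):
    """Distance in parsecs from the median parallax (milliarcseconds)."""
    plx = np.asarray(parallax_mas, dtype=float)
    plx = plx[plx > 0]  # drop non-physical (zero or negative) parallaxes
    return 1000.0 / np.median(plx)

def absolute_magnitude(apparent_mag, distance_pc):
    """Distance modulus: M = m - 5 log10(d / 10 pc), ignoring extinction."""
    m = np.asarray(apparent_mag, dtype=float)
    return m - 5.0 * np.log10(distance_pc / 10.0)

# Made-up values for five hypothetical cluster members (illustration only).
parallax_mas = [7.2, 7.4, 7.3, 7.5, 7.1]
g_mag = [6.0, 8.5, 10.2, 12.0, 13.7]

distance = cluster_distance_pc(parallax_mas)
abs_g = absolute_magnitude(g_mag, distance)
print(f"Cluster distance: {distance:.0f} pc")
print("Absolute G magnitudes:", np.round(abs_g, 2))
```

Plotting the absolute magnitudes against a color index, with the magnitude axis inverted, would give the color-magnitude diagram itself. The median is used here because it is less sensitive to stray field stars than the mean.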
Python can be leveraged to use industry-standard data science practices in the classroom. Introducing the Hertzsprung-Russell (HR) diagram is a chance to show statistical thinking in one of the fundamental tools of astronomy. The version presented here expands on the HR diagram activity to have students ask statistical thinking questions about the data and the visualizations they create. The Relative Abundance of Europium activity, in both spreadsheet form and coding form, has students explore the atomic contents of real stars using statistical thinking about central trends of distributions of atoms in stellar spectra. The Measuring Stellar Distances with Light activity has students using the power of coding to perform photometry calculations, which can lead to distance calculations for a star cluster using actual astronomical images. Finally, the Hubble Diagram uses SDSS data in both the spreadsheet and Google Colab versions. It has students querying the SDSS archives so that they can use a variety of statistical thinking tools to determine the age of the universe or the distance of a galaxy cluster from our solar system.
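The Hubble diagram analysis can be sketched as a straight-line fit of recession velocity against distance, with the slope giving the Hubble constant and its inverse giving a rough age for the universe. The sketch below is an illustration under stated assumptions, not the code from the activity: the function names are hypothetical, the data are synthetic (generated from an assumed slope of 70 km/s/Mpc rather than queried from SDSS), the fit is forced through the origin, and 1/H0 is only an approximate age estimate.

```python
import numpy as np

KM_PER_MPC = 3.0857e19       # kilometers in one megaparsec
SECONDS_PER_GYR = 3.15576e16  # seconds in one billion Julian years

def hubble_constant(distance_mpc, velocity_km_s):
    """Least-squares slope of v = H0 * d through the origin, in km/s/Mpc."""
    d = np.asarray(distance_mpc, dtype=float)
    v = np.asarray(velocity_km_s, dtype=float)
    return np.sum(d * v) / np.sum(d * d)

def hubble_time_gyr(h0_km_s_mpc):
    """Rough age estimate 1/H0, converted to billions of years."""
    return KM_PER_MPC / h0_km_s_mpc / SECONDS_PER_GYR

# Synthetic data that follow v = 70 * d exactly (illustration only).
d = np.array([50.0, 100.0, 200.0, 400.0])  # distances in Mpc
v = 70.0 * d                               # velocities in km/s

h0 = hubble_constant(d, v)
print(f"H0 = {h0:.1f} km/s/Mpc")
print(f"Hubble time = {hubble_time_gyr(h0):.1f} Gyr")
```

With real galaxy data, the scatter around the fitted line is a natural starting point for students to discuss uncertainty in the slope and in the resulting age.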
All shared activities have been used in high school astronomy classes. The lesson plans, the data, the code, and, in most cases, the solutions for teacher use are freely available and released under a Creative Commons license.
Acknowledgments
All the activities described here were used in my classroom over multiple iterations. But they were not developed in a vacuum. I want to acknowledge the continued support of Dr. Keely Finkelstein from the Department of Astronomy at the University of Texas at Austin, who has provided many professional development experiences related to astronomy over the years. I also owe a lot to Dr. Chris Sneden from the Department of Astronomy at the University of Texas at Austin for the opportunities he provided to learn authentic astronomy research techniques. The relative abundance activity is my work, as well as that of Justin Hickey from Episcopal High School, Olivia Kuper from North Greene High School, and Eileen Grzybowski, who retired from Norman High School. The activities using the Infrared Science Archive (IRSA) would not exist without the help of Dr. Luisa Rebull from the Infrared Processing and Analysis Center (IPAC) at the California Institute of Technology and the teachers from the NASA/IPAC Teacher Archive Research Program (NITARP) and the Big NITARP Alumni Research Project (BINAP). I would not have learned to use Python for science, much less use it in my teaching, if not for Dr. Sean Johnson from the University of Michigan. Incidentally, I was Sean’s AP Computer Science teacher a decade before, so I taught him C++ before he taught me Python. The Hubble-Lemaître’s Law activity would not exist without help from Sean and Dr. Britt Lundgren from the University of North Carolina Asheville. I would also like to thank the developers of the Sloan Digital Sky Survey (SDSS) Voyages activities, who inspired the Hubble’s law activity. The Hertzsprung-Russell introductory Python activity is based on work by Adam LaMee and his materials at CODINGinK12. Lastly, I borrowed data science visualization ideas from Stephen Shadle at the Fountain Valley School.
References
- Apple, L., Baunach, J., Connelly, G., Gahlhoff, S., Romanowicz, C. M., Vieyra, R. E., & Walker, L. (2021). Computational Modeling in High School Physics First: Postcards from the Edge. The Physics Teacher, 59(7), 535–539. https://doi.org/10.1119/10.0006458
- Bargagliotti, A., Franklin, C., Arnold, P., Johnson, S., Perez, L., Spangler, D. A., & Gould, R. (2020). Pre-K-12 Guidelines for Assessment and Instruction in Statistics Education II (GAISE II): A Framework for Statistics and Data Science Education. Retrieved July 4, 2024, from https://www.amstat.org/asa/files/pdfs/GAISE/GAISEIIPreK-12_Full.pdf
- Howell, S. B., Sobeck, C., Haas, M., Still, M., Barclay, T., Mullally, F., Troeltzsch, J., Aigrain, S., Bryson, S. T., Caldwell, D., Chaplin, W. J., Cochran, W. D., Huber, D., Marcy, G. W., Miglio, A., Najita, J. R., Smith, M., Twicken, J. D., & Fortney, J. J. (2014). The K2 Mission: Characterization and Early Results. Publications of the Astronomical Society of the Pacific, 126(938), 398–408. https://doi.org/10.1086/676406
- Hutchins, N. M., Biswas, G., Maróti, M., Lédeczi, Á., Grover, S., Wolf, R., Blair, K. P., Chin, D., Conlin, L., Basu, S., & McElhaney, K. (2020). C2STEM: A System for Synergistic Learning of Physics and Computational Thinking. Journal of Science Education and Technology, 29(1), 83–100. https://doi.org/10.1007/s10956-019-09804-9
- Israel-Fishelson, R., Moon, P., Tabak, R., & Weintrop, D. (2024). Understanding the Data in K-12 Data Science. Harvard Data Science Review, 6(2). https://doi.org/10.1162/99608f92.4f3ac3da
- Kollmeier, J., Anderson, S. F., Blanc, G. A., Blanton, M. R., Covey, K. R., Crane, J., Drory, N., Frinchaboy, P. M., Froning, C. S., Johnson, J. A., Kneib, J.-P., Kreckel, K., Merloni, A., Pellegrini, E. W., Pogge, R. W., Ramirez, S. V., Rix, H. W., Sayres, C., Sánchez-Gallego, J., … Weinberg, D. H. (2019). Astro2020 Project White Paper SDSS-V Pioneering Panoptic Spectroscopy Thematic Areas: Ground Based Project. https://baas.aas.org/pub/2020n7i274
- LaMee, A. (2021). Teaching Science Content with Jupyter at Scale, Elementary Through University. American Association of Physics Teachers Virtual Winter Meeting 2021, 88–88. https://doi.org/10.48448/46br-xe05
- Lee, I., Grover, S., Martin, F., Pillai, S., & Malyn-Smith, J. (2020). Computational Thinking from a Disciplinary Perspective: Integrating Computational Thinking in K-12 Science, Technology, Engineering, and Mathematics Education. Journal of Science Education and Technology, 29(1), 1–8. https://doi.org/10.1007/S10956-019-09803-W
- Lundgren, B., & Trainor, R. (2023). ESCIP: A collaboration for developing and sharing educational Jupyter Notebooks. American Astronomical Society Meeting Abstracts, 241, 246.04. https://baas.aas.org/pub/2023n2i246p04/release/1
- Lundgren, B., Tojeiro, R., Beaton, R. L., Blanton, M. R., Borissova, J., Cano-Díaz, M., Grabowski, K., Kurtev, R., Macdonald, N., Majewski, S. R., Masters, K. L., Meredith, K., Nitschelm, C., O’Reilly, T., Raddick, J., Skinner, D., Thakar, A., Weijmans, A., & Whelan, D. G. (2019). Data-driven education and public outreach with the Sloan Digital Sky Survey. In BAAA (Vol. 61). http://voyages.sdss.org/es/
- MacIsaac, D. (2016). Desmos: A free cross-platform calculating and graphing tool. The Physics Teacher, 54(8), 509–509. https://doi.org/10.1119/1.4965284
- Newland, J. (2020). Teaching with Code: Globular Cluster Distance Lab. Research Notes of the AAS, 4(7), 118. https://doi.org/10.3847/2515-5172/aba953
- Norman, D., Cruz, K., Desai, V., Lundgren, B., Bellm, E., Economou, F., Smith, A., Bauer, A., Nord, B., Schafer, C., Narayan, G., Li, T., Tollerud, E., Sipőcz, B., Stevance, H., Pickering, T., Sinha, M., Harrington, J., Kartaltepe, J., … Dong, C. (2019). The Growing Importance of a Tech Savvy Astronomy and Astrophysics Workforce. Bulletin of the AAS, 51(7). https://baas.aas.org/pub/2020n7i018
- Odden, T. O. B., Silvia, D. W., & Malthe-Sørenssen, A. (2023). Using computational essays to foster disciplinary epistemic agency in undergraduate science. Journal of Research in Science Teaching, 60(5), 937–977. https://doi.org/10.1002/tea.21821
- Orban, C. M., & Teeling-Smith, R. M. (2020). Computational Thinking in Introductory Physics. The Physics Teacher, 58(4), 247–251. https://doi.org/10.1119/1.5145470
- Rebull, L. M. (2024). Astronomy data in the classroom. Physics Today, 77(2), 44–50. https://doi.org/10.1063/pt.vlhh.iudp
- Sullivan, P. (2022). Using CODAP to Grow Students’ Probabilistic Reasoning. Mathematics Teacher: Learning and Teaching PK-12, 115(4), 283–293. https://doi.org/10.5951/mtlt.2021.0103
- Taghizadeh-Popp, M., Kim, J. W., Lemson, G., Medvedev, D., Raddick, M. J., Szalay, A. S., Thakar, A. R., Booker, J., Chhetri, C., Dobos, L., & Rippin, M. (2020). SciServer: A science platform for astronomy and beyond. Astronomy and Computing, 33, 100412. https://doi.org/10.1016/j.ascom.2020.100412
- Weintrop, D., Beheshti, E., Horn, M., Orton, K., Jona, K., Trouille, L., & Wilensky, U. (2016). Defining Computational Thinking for Mathematics and Science Classrooms. Journal of Science Education and Technology, 25(1), 127–147. https://doi.org/10.1007/s10956-015-9581-5
- Weller, D. P., Bott, T. E., Caballero, M. D., & Irving, P. W. (2021). Developing a learning goal framework for computational thinking in computationally integrated physics classrooms. 1–46. http://arxiv.org/abs/2105.07981
This work is licensed under a Creative Commons Attribution 4.0 International License.