Background
Interacting with existing astronomical datasets can transform how students learn astronomy while developing data science skills that transfer well beyond the astronomy classroom. Data can come from large-scale surveys, space-based observatories, individual scientists, or students. Students can learn to select, reduce, visualize, and interpret authentic astronomical data while applying data science techniques to construct astronomy knowledge. Many free web-based tools are available for teachers to leverage when integrating data science pedagogy into the classroom. This page discusses how to use the Infrared Science Archive (IRSA), the Sloan Digital Sky Survey (SDSS), and other publicly available data archives in data science contexts in the astronomy classroom. Several classroom-tested activities are linked below. Some activities leverage computer programming, while others involve no programming at all. All activities are freely available, so educators can expand and adapt them as needed. The page also explains how these activities connect data science pedagogy to the astronomy concepts covered in a typical Astro 101 course.
Using Archival Data alongside Computational Thinking
Modern astronomy has become a discipline driven by data science and computing (Lundgren & Trainor, 2023; Norman et al., 2019; Rebull, 2024; Taghizadeh-Popp et al., 2020). The astronomy research community has come to accept that early career scientists need to learn data science and computer science if they are to become future astronomical researchers (Norman et al., 2019; Lundgren & Trainor, 2023). In undergraduate physics courses, the integration of computing is already happening (Apple et al., 2021; Hutchins et al., 2020; Lee et al., 2020; Orban & Teeling-Smith, 2020; Weintrop et al., 2016; Weller et al., 2021). It is time to bring the same practices into K-12 classes, and it is possible: modern data science (Bargagliotti et al., 2020; Israel-Fishelson et al., 2024) and computational thinking (Orban & Teeling-Smith, 2020; Weintrop et al., 2016) can both be integrated into high school science courses like astronomy and physics (Lundgren et al., 2019; Newland, 2020; Rebull, 2024).
One way to design data science activities for the classroom is to use the Pre-K-12 Guidelines for Assessment and Instruction in Statistics Education II (GAISE II) framework (Bargagliotti et al., 2020). The GAISE II framework lays out the statistical problem-solving process in four steps, which can be applied iteratively. First, students formulate one or more statistical investigative questions. Next, they collect or consider data that helps them answer the question they asked. They then analyze the data. Finally, students interpret and communicate their results, including possible alternative explanations.
The Computational Thinking in Math and Science (CTMS) taxonomy (Weintrop et al., 2016) defines computational thinking practices used in secondary science and math classrooms. The CTMS taxonomy lays out four practices that can exist alone or in concert with the other practices. The CTMS data practices align very closely with the GAISE II framework.
Using both the GAISE II framework and the CTMS taxonomy, curriculum designers can use traditional spreadsheets (Israel-Fishelson et al., 2024), data science-specific tools like CODAP (Sullivan, 2022) or Desmos (MacIsaac, 2016), or tools that involve programming, like Google Colab or Jupyter notebooks (LaMee, 2021; Newland, 2020). The tools can vary across activities and classrooms, but the same statistical problem-solving process is used throughout (Bargagliotti et al., 2020).
Many publicly available sources of astronomical data exist. The high school astronomy classroom activities shared here draw on three large-scale datasets: the Sloan Digital Sky Survey (SDSS) (Kollmeier et al., 2019), the NASA/IPAC Infrared Science Archive (IRSA) (Rebull, 2024), and the NASA Exoplanet Archive (Howell et al., 2014).
Understanding how orbits are modeled using Kepler’s laws is a fundamental part of most astronomy and physics courses in high school. In the first activity, students construct knowledge about Kepler’s laws using NASA Exoplanet Archive data. The Kepler’s 3rd law data activity is presented in three versions: spreadsheets, the Desmos web-based calculator, and a Google Colab (Jupyter) notebook. Each version applies the statistical problem-solving process, using linear regression to determine the mass of the host star relative to our Sun.
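The regression idea can be sketched in a few lines of Python. This is a minimal illustration, not the code from the activity itself: it assumes semi-major axes in AU and orbital periods in days, neglects the planets' masses, and uses made-up planets around a hypothetical 0.8 solar-mass star rather than real archive values. In these units Kepler's third law reads P² = a³/M, so a straight-line fit of P² against a³ has a slope of 1/M.

```python
import math

DAYS_PER_YEAR = 365.25


def host_star_mass(semi_major_axes_au, periods_days):
    """Estimate a host star's mass in solar masses from its planets' orbits.

    With a in AU and P in years, Kepler's third law is P^2 = a^3 / M,
    so the slope of P^2 versus a^3 (fit through the origin) is 1 / M.
    """
    a_cubed = [a ** 3 for a in semi_major_axes_au]
    p_squared = [(p / DAYS_PER_YEAR) ** 2 for p in periods_days]
    # Least-squares slope for a line forced through the origin.
    slope = sum(x * y for x, y in zip(a_cubed, p_squared)) / sum(
        x * x for x in a_cubed
    )
    return 1.0 / slope


# Hypothetical, noise-free data for illustration only (not archive values).
true_mass = 0.8
axes_au = [0.05, 0.12, 0.30, 0.75]
periods = [DAYS_PER_YEAR * math.sqrt(a ** 3 / true_mass) for a in axes_au]

estimated_mass = host_star_mass(axes_au, periods)
print(f"Estimated host star mass: {estimated_mass:.2f} solar masses")
```

With real archive data the points scatter around the line, which gives students a natural opening to discuss measurement uncertainty and goodness of fit.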
The web-based NASA/IPAC IRSA tools are not designed for classroom teachers, but teachers can still build activities around the rich data they provide. In the Skynet and IRSA Nebula activity, students ask how far the star clusters inside the nebulae are from Earth. In the Creating Color-Magnitude Diagrams using IRSA activity, students use statistical thinking to build visualizations of stellar populations and characterize the ages and stellar makeup of clusters. In the 3-Color Astro Image with IRSA activity, students use actual data to create images of objects in space, which can be used to understand their astrophysical nature. The NASA/IPAC IRSA tools also access SDSS data and other extensive data archives, allowing students to cross-compare astrophysical information about objects from different catalogs.
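As a rough sketch of what building a color-magnitude diagram involves, the snippet below turns catalog magnitudes into plottable points. It assumes 2MASS-style J and Ks magnitude columns, which is an illustrative choice and may not match the bands used in the activity; the function name and the sample values are hypothetical.

```python
def cmd_points(stars):
    """Return (color, magnitude) pairs for a color-magnitude diagram.

    Each star is a dict of apparent magnitudes in two bands. The color
    index is the bluer band minus the redder band (here J - Ks), and the
    Ks magnitude is used for the vertical axis.
    """
    return [(star["J"] - star["Ks"], star["Ks"]) for star in stars]


# Made-up magnitudes for illustration only (not real catalog values).
sample = [
    {"J": 12.10, "Ks": 11.40},
    {"J": 10.50, "Ks": 10.20},
    {"J": 13.80, "Ks": 13.45},
]

points = cmd_points(sample)

# Smaller magnitudes mean brighter stars, so sort ascending to list the
# brightest first. A plotted diagram inverts the vertical axis for the
# same reason.
brightest_first = sorted(points, key=lambda point: point[1])
for color, magnitude in brightest_first:
    print(f"J-Ks = {color:.2f}, Ks = {magnitude:.2f}")
```

Plotting color on the horizontal axis and magnitude on an inverted vertical axis produces the diagram students then interpret.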
Python can be leveraged to use industry-standard data science practices in the classroom. Introducing the Hertzsprung-Russell (HR) diagram is a chance to show statistical thinking in one of the fundamental tools of astronomy. The version presented here expands on the HR diagram activity to have students ask statistical thinking questions about the data and the visualizations they create. The Relative Abundance of Europium activity, in both spreadsheet form and coding form, has students explore the atomic contents of real stars using statistical thinking about central trends of distributions of atoms in stellar spectra. The Measuring Stellar Distances with Light activity has students using the power of coding to perform photometry calculations, which can lead to distance calculations for a star cluster using actual astronomical images. Finally, the Hubble Diagram uses SDSS data in both the spreadsheet and Google Colab versions. It has students querying the SDSS archives so that they can use a variety of statistical thinking tools to determine the age of the universe or the distance of a galaxy cluster from our solar system.
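The Hubble diagram analysis can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the activity: it assumes distances in megaparsecs and recession velocities in km/s, fits the Hubble law v = H0 d with a line through the origin, and estimates the age of the universe as the Hubble time 1/H0, which ignores any change in the expansion rate. The galaxies are made up to lie exactly on a line with H0 = 70 km/s/Mpc.

```python
KM_PER_MPC = 3.0857e19       # kilometers in one megaparsec
SECONDS_PER_GYR = 3.15576e16  # seconds in one billion Julian years


def fit_hubble_constant(distances_mpc, velocities_km_s):
    """Least-squares slope of v = H0 * d for a line through the origin."""
    numerator = sum(d * v for d, v in zip(distances_mpc, velocities_km_s))
    denominator = sum(d * d for d in distances_mpc)
    return numerator / denominator


def hubble_time_gyr(h0_km_s_mpc):
    """Convert H0 in km/s/Mpc to the Hubble time 1/H0 in billions of years."""
    return KM_PER_MPC / h0_km_s_mpc / SECONDS_PER_GYR


# Hypothetical, noise-free galaxies for illustration only (not SDSS values).
distances = [50.0, 120.0, 200.0, 310.0]
velocities = [70.0 * d for d in distances]

h0 = fit_hubble_constant(distances, velocities)
age_gyr = hubble_time_gyr(h0)
print(f"H0 = {h0:.1f} km/s/Mpc, Hubble time = {age_gyr:.1f} Gyr")
```

Real galaxy data scatter around the fitted line, so students can compare the slope they obtain, and the age it implies, against published values and discuss where the differences come from.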
All shared activities have been used in high school astronomy classes. The lesson plans, the data, the code, and, in most cases, the solutions for teacher use are freely available and released under a Creative Commons license.
References
- Apple, L., Baunach, J., Connelly, G., Gahlhoff, S., Romanowicz, C. M., Vieyra, R. E., & Walker, L. (2021). Computational Modeling in High School Physics First: Postcards from the Edge. The Physics Teacher, 59(7), 535–539. https://doi.org/10.1119/10.0006458
- Bargagliotti, A., Franklin, C., Arnold, P., Johnson, S., Perez, L., Spangler, D. A., & Gould, R. (2020). Pre-K-12 Guidelines for Assessment and Instruction in Statistics Education II (GAISE II): A Framework for Statistics and Data Science Education. Retrieved July 4, 2024, from https://www.amstat.org/asa/files/pdfs/GAISE/GAISEIIPreK-12_Full.pdf
- Howell, S. B., Sobeck, C., Haas, M., Still, M., Barclay, T., Mullally, F., Troeltzsch, J., Aigrain, S., Bryson, S. T., Caldwell, D., Chaplin, W. J., Cochran, W. D., Huber, D., Marcy, G. W., Miglio, A., Najita, J. R., Smith, M., Twicken, J. D., & Fortney, J. J. (2014). The K2 Mission: Characterization and Early Results. Publications of the Astronomical Society of the Pacific, 126(938), 398–408. https://doi.org/10.1086/676406
- Hutchins, N. M., Biswas, G., Maróti, M., Lédeczi, Á., Grover, S., Wolf, R., Blair, K. P., Chin, D., Conlin, L., Basu, S., & McElhaney, K. (2020). C2STEM: A System for Synergistic Learning of Physics and Computational Thinking. Journal of Science Education and Technology, 29(1), 83–100. https://doi.org/10.1007/s10956-019-09804-9
- Israel-Fishelson, R., Moon, P., Tabak, R., & Weintrop, D. (2024). Understanding the Data in K-12 Data Science. Harvard Data Science Review, 6(2). https://doi.org/10.1162/99608f92.4f3ac3da
- Kollmeier, J., Anderson, S. F., Blanc, G. A., Blanton, M. R., Covey, K. R., Crane, J., Drory, N., Frinchaboy, P. M., Froning, C. S., Johnson, J. A., Kneib, J.-P., Kreckel, K., Merloni, A., Pellegrini, E. W., Pogge, R. W., Ramirez, S. V., Rix, H. W., Sayres, C., Sánchez-Gallego, J., … Weinberg, D. H. (2019). SDSS-V: Pioneering Panoptic Spectroscopy. Astro2020 Project White Paper (Thematic Area: Ground Based Project). https://baas.aas.org/pub/2020n7i274
- LaMee, A. (2021). Teaching Science Content with Jupyter at Scale, Elementary Through University. American Association of Physics Teachers Virtual Winter Meeting 2021, 88–88. https://doi.org/10.48448/46br-xe05
- Lee, I., Grover, S., Martin, F., Pillai, S., & Malyn-Smith, J. (2020). Computational Thinking from a Disciplinary Perspective: Integrating Computational Thinking in K-12 Science, Technology, Engineering, and Mathematics Education. Journal of Science Education and Technology, 29(1), 1–8. https://doi.org/10.1007/S10956-019-09803-W
- Lundgren, B., & Trainor, R. (2023). ESCIP: A collaboration for developing and sharing educational Jupyter Notebooks. American Astronomical Society Meeting Abstracts, 241, 246.04. https://baas.aas.org/pub/2023n2i246p04/release/1
- Lundgren, B., Tojeiro, R., Beaton, R. L., Blanton, M. R., Borissova, J., Cano-Díaz, M., Grabowski, K., Kurtev, R., Macdonald, N., Majewski, S. R., Masters, K. L., Meredith, K., Nitschelm, C., O’Reilly, T., Raddick, J., Skinner, D., Thakar, A., Weijmans, A., & Whelan, D. G. (2019). Data-driven education and public outreach with the Sloan Digital Sky Survey. In BAAA (Vol. 61). http://voyages.sdss.org/es/
- MacIsaac, D. (2016). Desmos: A free cross-platform calculating and graphing tool. The Physics Teacher, 54(8), 509–509. https://doi.org/10.1119/1.4965284
- Newland, J. (2020). Teaching with Code: Globular Cluster Distance Lab. Research Notes of the AAS, 4(7), 118. https://doi.org/10.3847/2515-5172/aba953
- Norman, D., Cruz, K., Desai, V., Lundgren, B., Bellm, E., Economou, F., Smith, A., Bauer, A., Nord, B., Schafer, C., Narayan, G., Li, T., Tollerud, E., Sipőcz, B., Stevance, H., Pickering, T., Sinha, M., Harrington, J., Kartaltepe, J., … Dong, C. (2019). The Growing Importance of a Tech Savvy Astronomy and Astrophysics Workforce. Bulletin of the AAS, 51(7). https://baas.aas.org/pub/2020n7i018
- Orban, C. M., & Teeling-Smith, R. M. (2020). Computational Thinking in Introductory Physics. The Physics Teacher, 58(4), 247–251. https://doi.org/10.1119/1.5145470
- Rebull, L. M. (2024). Astronomy data in the classroom. Physics Today, 77(2), 44–50. https://doi.org/10.1063/pt.vlhh.iudp
- Sullivan, P. (2022). Using CODAP to Grow Students’ Probabilistic Reasoning. Mathematics Teacher: Learning and Teaching PK-12, 115(4), 283–293. https://doi.org/10.5951/mtlt.2021.0103
- Taghizadeh-Popp, M., Kim, J. W., Lemson, G., Medvedev, D., Raddick, M. J., Szalay, A. S., Thakar, A. R., Booker, J., Chhetri, C., Dobos, L., & Rippin, M. (2020). SciServer: A science platform for astronomy and beyond. Astronomy and Computing, 33, 100412. https://doi.org/10.1016/j.ascom.2020.100412
- Weintrop, D., Beheshti, E., Horn, M., Orton, K., Jona, K., Trouille, L., & Wilensky, U. (2016). Defining Computational Thinking for Mathematics and Science Classrooms. Journal of Science Education and Technology, 25(1), 127–147. https://doi.org/10.1007/s10956-015-9581-5
- Weller, D. P., Bott, T. E., Caballero, M. D., & Irving, P. W. (2021). Developing a learning goal framework for computational thinking in computationally integrated physics classrooms. 1–46. http://arxiv.org/abs/2105.07981