Automating Course File Analysis: A Data-Driven Approach to Accessibility Reporting

The Problem: Managing Accessibility at Scale

With the increasing reliance on digital course materials, higher education institutions face growing challenges in ensuring accessibility compliance. A recent rule issued under Title II of the Americans with Disabilities Act (ADA) requires all publicly funded organizations to meet WCAG 2.1 AA standards for digital content, including instructor materials, internal documents, and student-generated content.

At Michigan State University, we use D2L Brightspace as our learning management system (LMS), with Blackboard Ally (called Spartan Ally at MSU) integrated to monitor overall course accessibility and the accessibility of each course file/activity. As the lead of the Learning Technology & Development team in the Broad College of Business, I recognized the need for a scalable solution to track and analyze course accessibility in near real-time. The goal?

  • Ensure compliance with accessibility standards.
  • Provide faculty and administrators with actionable insights.
  • Efficiently allocate limited support resources to the areas with the greatest impact.

The Challenge: Messy Data

The Spartan Ally dataset spans more than a decade (since 2013) and covers thousands of courses from every college at MSU. However, analyzing the raw data was difficult due to:

  • Irrelevant terms & departments. The dataset includes every course from all colleges for all terms since 2013.
  • Inconsistent department names. Example: Accounting appears as “Accounting”, “Accounting; Michigan State University”, and “Michigan State University; Accounting”. Some courses are cross-listed across two departments, creating up to six different variations for the same combination of departments.
  • Merged courses with inconsistent names. Instructors can name merged courses anything they want, making automatic categorization difficult.
  • Lack of instructor information. Some courses have multiple instructors teaching different sections within the same term, while others switch instructors across semesters. We needed a way to determine which course iterations to keep in the report.

Manually cleaning and analyzing this data, especially handling merged courses, was extremely time-consuming, error-prone, and inefficient. Automation was the only scalable option, and a Python script was the natural fit.

However, it had been well over five years since I last wrote Python. So I turned to AI-assisted development, collaborating with ChatGPT as a real-time programming partner. I provided ChatGPT with the original dataset (using a paid account so our data would not be used for model training) and explained in detail what I needed to accomplish, the steps of the manual process I had developed, and the known pitfalls I had previously encountered. The code ChatGPT initially provided did have bugs and was not able to handle unexpected edge cases, so my past experience with Python was essential for identifying where issues were occurring and communicating them back to ChatGPT. Still, I wrote very, very little of the actual code. This experience showcased the power of AI not just as a tool but as a development collaborator, bridging gaps in complex problem-solving and making data-driven automation more accessible.


Our Approach: A Python-Powered Solution

Our Python script (available on GitHub) is designed to filter, clean, and standardize data while intelligently handling merged courses and producing actionable reports. The script begins by filtering the data, removing unnecessary columns, selecting relevant semesters (e.g., Spring 2023 through Spring 2025), and retaining only key departments for reporting. It then cleans and standardizes the data by shortening department names (e.g., “SUPPLY CHAIN MANAGEMENT” becomes “SCM”) to simplify data visualization and generates “comparative course codes” to identify unique course-section combinations across semesters and merges. For instance, “SS25-ACC-100-001” is transformed into “ACC-100-001” for easier cross-term tracking.
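
To make the cleaning step concrete, here is a minimal sketch of the comparative-code and department-shortening logic. The column names ("Course Code", "Department"), the file name, and the abbreviation map are illustrative stand-ins rather than the exact names from our dataset:

    import re
    import pandas as pd

    # Illustrative abbreviation map -- extend with your own departments.
    DEPT_ABBREVIATIONS = {"SUPPLY CHAIN MANAGEMENT": "SCM", "ACCOUNTING": "ACC"}

    def comparative_code(course_code: str) -> str:
        """Strip the leading term token: 'SS25-ACC-100-001' -> 'ACC-100-001'."""
        return re.sub(r"^[A-Z]{2}\d{2}-", "", course_code)

    df = pd.read_csv("ally_export.csv")  # hypothetical export file name
    df["Department"] = df["Department"].replace(DEPT_ABBREVIATIONS)
    df["Comparative Code"] = df["Course Code"].map(comparative_code)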

The script also handles merged courses intelligently. In D2L, each course section has a unique numeric identifier (e.g., 223213573), and merged courses receive a combined code (e.g., “MERGED-[component course numeric identifiers]”). The script locates these component courses, generates the comparative course codes they would have had, and assigns a standardized code for the merged course, such as “SCM-474/MGT-231 MERGED,” eliminating inconsistent instructor-generated names.
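
In simplified form, that step might look like the sketch below. It assumes a lookup table, built from the non-merged rows, from each numeric section identifier to the comparative code that section would have had:

    def standardize_merged(merged_id: str, code_by_section_id: dict) -> str:
        """Turn 'MERGED-223213573-223213574' into 'SCM-474/MGT-231 MERGED'."""
        section_ids = merged_id.removeprefix("MERGED-").split("-")
        courses = []
        for sid in section_ids:
            code = code_by_section_id.get(sid)          # e.g. 'SCM-474-001'
            if code:
                courses.append(code.rsplit("-", 1)[0])  # drop section: 'SCM-474'
        unique = list(dict.fromkeys(courses))           # dedupe, keep order
        return "/".join(unique) + " MERGED"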

To prioritize the most relevant data, the script sorts the instances of each course by academic term (Spring → Summer → Fall), identifies the two most recent instances, and determines whether both instances are informative to include. If the total number of files and the overall accessibility scores differ by 10% or more, both course instances are included; otherwise, only the most recent instance is retained, ensuring that redundant data is minimized while important changes are tracked over time. Finally, the script generates department-specific reports and a college-wide summary report for administrators while automating monthly reporting to make accessibility tracking consistent and scalable.
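
A condensed sketch of that decision logic follows. The term codes (MSU-style "SS"/"US"/"FS"), the column names, and the reading of the 10% rule (here, a shift in either metric keeps both instances) are assumptions to adapt:

    TERM_ORDER = {"SS": 0, "US": 1, "FS": 2}  # Spring -> Summer -> Fall

    def term_key(term: str):
        """Chronological sort key for term codes like 'SS25', 'US25', 'FS25'."""
        return (int(term[2:]), TERM_ORDER[term[:2]])

    def pct_change(old, new):
        return abs(new - old) / old if old else 1.0

    def instances_to_keep(instances):
        """Keep two instances when the course changed meaningfully, else one."""
        recent = sorted(instances, key=lambda r: term_key(r["Term"]))[-2:]
        if len(recent) < 2:
            return recent
        prev, last = recent
        changed = (pct_change(prev["Total Files"], last["Total Files"]) >= 0.10
                   or pct_change(prev["Overall Score"], last["Overall Score"]) >= 0.10)
        return [prev, last] if changed else [last]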


Considerations for Repurposing This Code at Other Colleges

This script is designed for flexibility, making it adaptable for other institutions with similar accessibility tracking needs. However, before using or modifying it, consider:

1. Data Format & Structure

  • If your dataset uses different column names (e.g., “Course Code” or “Total Files”), adjust the filtering logic accordingly.
  • If your institution uses different semester naming conventions, adjust the sorting logic accordingly. (A configuration sketch follows below.)
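
Both adjustments are easiest if those assumptions live in one configuration block. A sketch, with hypothetical names throughout:

    # Illustrative configuration -- point these at your export's actual names.
    COLUMNS = {
        "course_code": "Course Code",
        "total_files": "Total Files",
        "overall_score": "Overall Score",
    }
    # A school using 'SP25' / 'SU25' / 'FA25' terms would sort them like this:
    TERM_ORDER = {"SP": 0, "SU": 1, "FA": 2}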

2. Department Filtering

  • Update the list of relevant departments to match your college’s structure.
  • Modify department name standardization rules (see the sketch below).
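
For example, continuing the pandas sketch from earlier (department codes are hypothetical):

    RELEVANT_DEPARTMENTS = {"ACC", "MGT", "SCM"}  # your college's units

    # Collapse the inconsistent variants noted earlier onto a single code.
    DEPT_STANDARDIZATION = {
        "Accounting": "ACC",
        "Accounting; Michigan State University": "ACC",
        "Michigan State University; Accounting": "ACC",
    }

    df["Department"] = df["Department"].replace(DEPT_STANDARDIZATION)
    df = df[df["Department"].isin(RELEVANT_DEPARTMENTS)]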

3. Merged Course Handling

  • If your institution doesn’t use merged courses, you can remove that logic.
  • If the identifier convention for merged courses is different, adjust the parsing logic accordingly (sketched below).
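
For instance, if your LMS labeled merged shells something like "XLIST_223213573_223213574" (a hypothetical convention), only the identifier extraction would need to change:

    import re

    def component_section_ids(merged_id: str):
        """Pull out every numeric section id, whatever the prefix or separator."""
        return re.findall(r"\d+", merged_id)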

4. Historical Comparison Metrics

  • We used Total Files and Overall Score to determine whether to keep one or two course instances. If other accessibility metrics are more relevant, update the comparison logic (see the sketch below).
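
Swapping metrics only touches the comparison itself. For example, with a hypothetical "Severe Issues" column:

    def significant_change(prev, last, threshold=0.10):
        """Flag a meaningful shift in a single metric between two instances."""
        old, new = prev["Severe Issues"], last["Severe Issues"]
        return abs(new - old) / old >= threshold if old else new > 0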

Final Thoughts

This project has transformed how we handle course accessibility reporting—turning a manual, tedious process into an automated, scalable workflow. With this system in place, we can generate clear, actionable reports for faculty and administrators on a regular basis (currently monthly), improving accessibility in our digital learning materials.

By sharing our approach, we hope other institutions can adapt and refine this method to meet their own needs, making accessibility reporting faster, smarter, and more effective.


Next Steps

Want to implement something similar at your institution? The code is available on GitHub, and you are more than welcome to contact me to set up a short meeting to chat about how to adapt this approach for your needs!