AI Incident Classification
This work is now also hosted on the MIT AI Risk Repository website.
All incidents in the AI Incident Database have been processed using an LLM and classified according to the MIT AI Risk Repository causal and domain taxonomies. Each incident was then scored for harm severity on 10 dimensions based on the CSET AI Harm Taxonomy, using a scale that runs from zero impact to 'worst-case catastrophe'.
This is intended as a proof of concept to explore the potential capabilities and limitations of a scalable incident analysis framework.
This blog post discusses the background, the approach taken, preliminary results and next steps.
Please feel free to explore the analysis through the dashboard pages below and share feedback.
Home: Risk Classification - distribution of incident classifications (causal and domain) across the entire dataset
Record View - classification and harm severity scores for each individual record, including a summary of the reasoning and the confidence in the analysis. This view supports filtering by any combination of data fields.
Timeline: Risk Classification - distribution of incident classifications by year
Timeline: Sub-domains - distribution of incident sub-domains by year
Timeline: High Severity Incidents - incidents with high direct harm severity scores by year
Timeline: High Severity Multiple Categories - incidents causing severe harm in more than one harm category
Timeline: Direct Harm Caused - distribution of harm severity scores by year
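To make the aggregations behind the timeline pages more concrete, here is a minimal Python sketch of how a classified record and the "multiple categories" count could be represented. It is not the project's actual code. The field names, the harm category labels, the 0-10 numeric range, and the threshold of 7 are all assumptions made for illustration, and the sample records are invented rather than taken from the AI Incident Database.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List

# Assumption: scores run 0-10 and 7 or above counts as "high severity".
# The real scale bounds and cut-off are not stated in this post.
HIGH_SEVERITY = 7


@dataclass
class IncidentRecord:
    """One classified incident (hypothetical field names)."""
    incident_id: int
    year: int
    causal: str   # causal taxonomy label
    domain: str   # domain taxonomy label
    harm_scores: Dict[str, int] = field(default_factory=dict)

    def high_severity_categories(self, threshold: int = HIGH_SEVERITY) -> List[str]:
        """Return the harm categories scored at or above the threshold."""
        return sorted(c for c, s in self.harm_scores.items() if s >= threshold)


def multi_category_by_year(records, threshold: int = HIGH_SEVERITY) -> Dict[int, int]:
    """Count, per year, incidents with high severity in more than one category."""
    counts = Counter()
    for record in records:
        if len(record.high_severity_categories(threshold)) > 1:
            counts[record.year] += 1
    return dict(sorted(counts.items()))


# Invented sample data, for illustration only.
records = [
    IncidentRecord(1, 2020, "causal-a", "domain-x",
                   {"physical": 8, "financial": 7, "psychological": 2}),
    IncidentRecord(2, 2020, "causal-b", "domain-y",
                   {"physical": 9, "financial": 1}),
    IncidentRecord(3, 2021, "causal-a", "domain-y",
                   {"physical": 7, "financial": 10}),
]

summary = multi_category_by_year(records)
```

With these sample records, only incidents 1 and 3 exceed the threshold in more than one category, so `summary` holds one incident for each of 2020 and 2021. Keeping the threshold as a parameter makes it easy to see how sensitive the counts are to the chosen cut-off.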
Example outputs: