We strive to present our data as clearly as possible. After cleaning our data, we hand-picked the best of it to create comprehensible graphs. Below are our scatter plots, correlation maps, pie charts, and box plots.
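A correlation matrix is a table showing the correlation coefficient between each pair of variables. Correlation matrices are used to summarize data and as diagnostic tools for advanced analyses. A positive correlation means that two variables tend to increase together, whereas a negative correlation means that one variable tends to decrease as the other increases. A correlation near zero means that the two variables have little to no linear relationship with each other. Correlation on its own does not show that one variable affects the other.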
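To make the sign of a correlation concrete, here is a minimal sketch that builds a Pearson correlation matrix using only the Python standard library. The helper names (`pearson`, `correlation_matrix`) and the toy columns are hypothetical and are not taken from the asteroid dataset or from the code used to produce our plots.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Return a nested dict {name_a: {name_b: r}} for every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Made-up toy values, for illustration only.
toy = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],    # rises with "a": r close to +1
    "c": [8.0, 6.0, 4.0, 2.0],    # falls as "a" rises: r close to -1
    "d": [1.0, -1.0, -1.0, 1.0],  # no linear trend with "a": r close to 0
}
matrix = correlation_matrix(toy)
```

In this toy example, `matrix["a"]["b"]` is strongly positive, `matrix["a"]["c"]` is strongly negative, and `matrix["a"]["d"]` is close to zero, matching the three cases described above.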
This pie chart shows the percentages of non-hazardous and hazardous asteroids in the dataset we used. A problem that we encountered when building our models is that there were more data entries for the non-hazardous asteroids than for the hazardous ones, causing an imbalance. This imbalance caused our accuracy metrics to drop significantly, so we decided to fix it by using the random over-sampling method, which adds copies of randomly selected entries to the minority class (in our case, the hazardous asteroids).
When comparing the two pie charts above, we can see that the random over-sampling method did its job: it made copies of the hazardous entries until their count equaled that of the non-hazardous asteroids.
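The idea behind random over-sampling can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the code used for our models: the function name `random_oversample`, the fixed seed, and the toy rows are all hypothetical. Libraries such as imbalanced-learn offer a ready-made `RandomOverSampler` for the same purpose.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many entries as the largest class."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    counts = Counter(labels)
    target = max(counts.values())      # size of the majority class
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        # Draw with replacement from the existing rows of this class.
        extra = [rng.choice(pool) for _ in range(target - count)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels

# Made-up toy data: five non-hazardous entries (False), one hazardous (True).
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.9]]
labels = [False, False, False, False, False, True]

balanced_rows, balanced_labels = random_oversample(rows, labels)
balanced_counts = Counter(balanced_labels)
```

After the call, both classes have the same number of entries, and every added hazardous row is an exact copy of an existing one. A common precaution is to over-sample only the training split, so that duplicated entries do not also appear in the data used for evaluation.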
This box plot shows that the hazardous asteroids have a higher maximum, minimum, and median than the non-hazardous asteroids.
This box plot shows that the hazardous asteroids have a lower maximum, minimum, and median than the non-hazardous asteroids.
This box plot shows that the hazardous and non-hazardous asteroids have almost equal medians, minimums, and maximums for miss distance, which suggests that miss distance on its own does little to distinguish hazardous asteroids from non-hazardous ones.
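The comparisons read off the box plots can also be checked numerically. The sketch below computes the minimum, median, and maximum of one feature for each class. The function name `summarize_by_class` and the toy values are hypothetical, and the numbers are invented rather than taken from the dataset.

```python
from statistics import median

def summarize_by_class(values, labels):
    """Group values by class label and report min, median, and max per class."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {
        label: {"min": min(vs), "median": median(vs), "max": max(vs)}
        for label, vs in groups.items()
    }

# Made-up toy values standing in for a single feature such as miss distance.
feature = [1.0, 5.0, 9.0, 2.0, 5.0, 8.0]
hazardous = [True, True, True, False, False, False]

summary = summarize_by_class(feature, hazardous)
```

If the per-class summaries are nearly identical, as they are for the medians in this toy example, the feature by itself is unlikely to separate the two classes well.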
The minimum diameter shows a consistent correlation with relative velocity.
The maximum diameter shows a consistent correlation with relative velocity.
The hazardous and non-hazardous asteroids each fall within a certain range in the velocity correlation. The hazardous asteroids span a range of roughly [15, 23], and the non-hazardous asteroids span a range of roughly [14, 32].