Gaurav Kumar
3 min read · Jun 10, 2021


Task 05 👨🏻‍💻

Task Description 📄

📌 Create a blog/article/video about cyber crime cases where they talk about confusion matrix or its two types of error.

How confusion matrices can help ML models in cybercrime investigation.

Cybercrime

Cybercrime is criminal activity that either targets or uses a computer, a computer network or a networked device.

Most, but not all, cybercrime is committed by cybercriminals or hackers who want to make money. Cybercrime is carried out by individuals or organizations.

Types of cybercrime

Here are some specific examples of the different types of cybercrime:

  • Email and internet fraud.
  • Identity fraud (where personal information is stolen and used).
  • Theft of financial or card payment data.
  • Cyberextortion (demanding money to prevent a threatened attack).
  • Cryptojacking (where hackers mine cryptocurrency using resources they do not own).
  • Cyberespionage (where hackers access government or company data).

CONFUSION MATRIX

A confusion matrix is a table often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the actual values are known. When we want to measure the effectiveness of a trained model, the confusion matrix comes into play: it is a performance measurement for machine learning classification.

The confusion matrix shows the ways in which the ML classification model is confused when it makes predictions. It is a summary of the prediction results:

  • “true positive” for correctly predicted event values.
  • “false positive” for incorrectly predicted event values.
  • “true negative” for correctly predicted no-event values.
  • “false negative” for incorrectly predicted no-event values.

The true cases are all good, but the false conditions are considered errors. A false positive is a Type I error, and a false negative is a Type II error.
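To make the four cells concrete, here is a minimal sketch that tallies them from a pair of label lists. The labels (1 = "fraudulent", 0 = "legitimate") and the data are hypothetical, chosen purely for illustration:

```python
# Hypothetical ground truth and model predictions (1 = fraudulent, 0 = legitimate)
actual    = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # true positives
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # false positives (Type I error)
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # true negatives
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # false negatives (Type II error)

print("          Predicted 1   Predicted 0")
print(f"Actual 1      {tp}             {fn}")
print(f"Actual 0      {fp}             {tn}")
```

For this toy data the model catches 3 of the 5 fraudulent cases, misses 2, and wrongly flags 1 legitimate case.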

Measuring Error

To understand how a model is performing, there are a variety of ways to measure the interplay of these types of conditions. A confusion matrix (yes, that is really what it is called) is used to derive multiple types of error measurements so a data scientist can determine whether the model is performing well. Below we will cover the following types of error measurements:

  • Specificity or True Negative Rate (TNR)
  • Precision, Positive Predictive Value (PPV)
  • Recall, Sensitivity, Hit Rate or True Positive Rate (TPR)
  • F Measure (F1, F0.5, F2)
  • Matthew’s Correlation Coefficient (MCC)
  • ROC Area (ROC AUC)
  • Fallout, False Positive Rate (FPR)
  • R², Coefficient of Determination (r²)
  • Root Mean Squared Error (RMSE)
  • Mean Absolute Error (MAE)
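Most of the classification metrics in this list can be computed directly from the four cell counts. A short sketch, using illustrative counts rather than figures from any real investigation:

```python
# Hypothetical confusion-matrix counts (illustrative only)
tp, fp, tn, fn = 80, 10, 95, 15

tnr = tn / (tn + fp)              # specificity / true negative rate
fpr = fp / (fp + tn)              # fallout / false positive rate
tpr = tp / (tp + fn)              # recall / sensitivity / true positive rate
ppv = tp / (tp + fp)              # precision / positive predictive value
f1  = 2 * ppv * tpr / (ppv + tpr) # harmonic mean of precision and recall

# Matthews Correlation Coefficient: balances all four cells in one score
mcc = (tp * tn - fp * fn) / ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5

print(f"TNR={tnr:.3f} FPR={fpr:.3f} TPR={tpr:.3f} PPV={ppv:.3f} F1={f1:.3f} MCC={mcc:.3f}")
```

Note that TNR and FPR always sum to 1, which is why a matrix that looks good on one metric can still hide a poor error rate on another. (R², RMSE, and MAE in the list above are typically regression metrics and are computed from raw predictions rather than these counts.)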

Precision

Precision measures how good our model is when the prediction is positive. It is the ratio of correct positive predictions to all positive predictions:
precision = TP / (TP + FP)

TP is the number of true positives, and FP is the number of false positives.
A trivial way to have perfect precision is to make one single positive prediction and ensure it is correct (precision = 1/1 = 100%). This would not be very useful since the classifier would ignore all but one positive instance.
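The "trivial perfect precision" trap above is easy to demonstrate in code. A small sketch with a hypothetical helper:

```python
def precision(tp: int, fp: int) -> float:
    """Ratio of correct positive predictions to all positive predictions."""
    return tp / (tp + fp)

# A normal model: 80 correct positive calls, 10 wrong ones
print(precision(80, 10))  # roughly 0.889

# The trivial trap: one single positive prediction, and it happens to be right.
# Precision is a perfect 1.0, yet every other real positive was ignored.
print(precision(1, 0))    # 1.0
```

This is why precision is almost never reported alone; it must be read alongside recall.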

Recall

Recall goes another route. Instead of looking at the number of false positives the model predicted, recall looks at the number of false negatives that were thrown into the prediction mix.
recall = TP / (TP + FN)
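Continuing the same hypothetical fraud-detection scenario, recall asks how many of the actual positives the model caught:

```python
def recall(tp: int, fn: int) -> float:
    """Ratio of correct positive predictions to all actual positives."""
    return tp / (tp + fn)

# Suppose there were 100 actual fraud cases: the model caught 80 and missed 20
print(recall(80, 20))  # 0.8
```

In a cybercrime setting a false negative (a missed attack) is often costlier than a false positive (a wrongly flagged event), so investigators typically weight recall heavily.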
