Finding correctly and incorrectly classified data

February 07, 2018, at 4:40 PM

I want to find which raw data samples were classified correctly and which were misclassified after applying the Multinomial Naive Bayes classification algorithm. For instance, I got an accuracy of 88% after applying Multinomial Naive Bayes. I want to see the 12% of the data that was misclassified as well as the 88% that was classified correctly. Thanks in advance.

My data set:

+----------------------+------------+
| Details              | Category   |
+----------------------+------------+
| Any raw text1        | cat1       |
+----------------------+------------+
| any raw text2        | cat1       |
+----------------------+------------+
| any raw text5        | cat2       |
+----------------------+------------+
| any raw text7        | cat1       |
+----------------------+------------+
| any raw text8        | cat2       |
+----------------------+------------+
| Any raw text4        | cat4       |
+----------------------+------------+
| any raw text5        | cat4       |
+----------------------+------------+
| any raw text6        | cat3       |
+----------------------+------------+

My code:

import pandas as pd
import numpy as np
import scipy as sp
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt

# load the two columns of interest from the tab-delimited file
data = pd.read_csv('mydat.xls', delimiter='\t',
                   usecols=['Details', 'Category'], encoding='utf-8')
target_one = data['Category']
target_list = data['Category'].unique()
x_train, x_test, y_train, y_test = train_test_split(
    data.Details, data.Category, random_state=42)
vect = CountVectorizer(ngram_range=(1, 2))
# converting training features into a numeric count matrix
X_train = vect.fit_transform(x_train.values.astype('U'))
# converting test features using the vocabulary learned from the training set
X_test = vect.transform(x_test.values.astype('U'))
# start = time.clock()
mnb = MultinomialNB(alpha=0.13)
mnb.fit(X_train, y_train)
result = mnb.predict(X_test)

# mnb.predict_proba(X_test)[0:10,1]
# accuracy_score expects the true labels first, then the predictions
accuracy_score(y_test, result)
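
One possible way to get the two groups is to compare the predictions with the true labels of the held-out test set and use the result as a boolean mask. The sketch below is only an illustration under assumptions: the helper name split_by_correctness and the column names Actual and Predicted are hypothetical, and the toy values stand in for x_test, y_test and result from the code above.

```python
import pandas as pd


def split_by_correctness(texts, y_true, y_pred):
    """Return (correct, incorrect) DataFrames for held-out samples.

    texts  : raw test documents (e.g. x_test from train_test_split)
    y_true : true labels for those documents (e.g. y_test)
    y_pred : labels returned by the classifier (e.g. mnb.predict(X_test))
    """
    # list() discards any shuffled pandas index, so the three inputs are
    # paired purely by position, which is how predict() orders its output.
    report = pd.DataFrame({
        "Details": list(texts),
        "Actual": list(y_true),
        "Predicted": list(y_pred),
    })
    # Boolean mask: True where the prediction matches the true label.
    is_correct = report["Actual"] == report["Predicted"]
    return report[is_correct], report[~is_correct]


# Toy stand-ins for x_test, y_test and result; the values are made up.
texts = ["any raw text2", "any raw text5", "any raw text6", "any raw text8"]
actual = ["cat1", "cat2", "cat3", "cat2"]
predicted = ["cat1", "cat2", "cat1", "cat2"]

correct, incorrect = split_by_correctness(texts, actual, predicted)
print(f"accuracy: {len(correct) / len(texts):.2f}")
print(incorrect)
```

With the variables from the question, the call would be split_by_correctness(x_test, y_test, result). Note that this only covers the test split, since accuracy is measured on those rows; the training rows are never predicted in the code above.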