machine learning - Multiple features into one using Pipeline and FeatureUnion from Python Scikit-learn


I am training a model to predict the gender of a person. I have two features, 'name' and 'randint', each coming from a different pandas column. I am trying to 1) combine them using Pipeline/FeatureUnion and 2) add the predicted label onto the original pandas data frame. However, I am getting an error for the first objective, 1):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegressionCV
from sklearn.cross_validation import train_test_split
from sklearn.base import TransformerMixin
import pandas as pd
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.pipeline import FeatureUnion
import numpy as np

clf = make_pipeline(CountVectorizer(), LogisticRegressionCV(cv=2))

data = {
    'bruce lee': 'male',
    'bruce banner': 'male',
    'peter parker': 'male',
    'peter poker': 'male',
    'peter springsteen': 'male',
    'bruce willis': 'male',
    'sarah mclaughlin': 'female',
    'sarah silverman': 'female',
    'sarah palin': 'female',
    'sarah hyland': 'female',
    'bruce li': 'male',
    'bruce milk': 'male',
    'bruce springsteen': 'male',
    'bruce willis': 'male',
    'sally juice': 'female',
    'sarah silverwoman': 'female',
    'sarah palin': 'female',
    'sarah hyland': 'female',
    'bruce paul': 'male',
    'bruce lame': 'male',
    'bruce springsteen': 'male',
    'bruce willis': 'male',
    'sarah willis': 'female',
    'sarah goldman': 'female',
    'sarah palin': 'female',
    'sally hyland': 'female',
    'bruce mcdonald': 'male',
    'bruce lane': 'male',
    'peter springsteen': 'male',
    'bruce willis': 'male',
    'sarah mclaughlin': 'female',
    'sarah goldwoman': 'female',
    'sarah palin': 'female',
    'sarah hylie': 'female'
    }

df = pd.DataFrame.from_dict(data, orient='index').reset_index()
df.columns = ['name', 'gender']
df['randomint'] = np.random.choice(range(1, 6), df.shape[0])

class ExtractNames(TransformerMixin):
    def transform(self, x, *args):
        return [{'first': name.split()[0],
                 'last': name.split()[-1]}
                for name in x]

    def fit(self, *args):
        return self

class ExtractRandInt(TransformerMixin):
    def transform(self, x2, *args):
        return [{'randint': num} for num in x2]

    def fit(self, *args):
        return self

trans = ExtractNames()
trans2 = ExtractRandInt()
combined = FeatureUnion([trans, trans2])

clf = make_pipeline(combined(), DictVectorizer(), LogisticRegressionCV())
df_train, df_test = train_test_split(df, train_size=0.5, random_state=68)
clf.fit(df_train['name'], df_train['randomint'], df_train['gender'])

Error:

Traceback (most recent call last):
  File "c:\users\kubik\desktop\test5.py", line 74, in <module>
    clf = make_pipeline(combined(), DictVectorizer(), LogisticRegressionCV())
TypeError: 'FeatureUnion' object is not callable

You can't call () on the combined object. Calling a class works because that invokes its constructor, but combined is already an instance, and it has no __call__ method. The line must be:

clf = make_pipeline(combined, DictVectorizer(), LogisticRegressionCV())
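
The difference between calling a class and calling an instance can be seen without scikit-learn. The sketch below is only an illustration: Combiner is a hypothetical stand-in for a class such as FeatureUnion, not part of any library.

```python
class Combiner:
    """Hypothetical stand-in for an estimator class such as FeatureUnion."""

    def __init__(self, parts):
        self.parts = parts


# Calling the class runs the constructor and returns an instance.
combined = Combiner(["names", "randint"])

# The class is callable; the instance is not, because Combiner
# defines no __call__ method.
print(callable(Combiner))
print(callable(combined))

# Calling the instance raises the same kind of TypeError as in the question.
try:
    combined()
except TypeError as exc:
    message = str(exc)
    print(message)
```

Passing combined (without parentheses) hands the existing instance to make_pipeline, which is what the corrected line above does.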
