因此,我试图使这个程序,将要求用户输入,并将值存储在一个数组/列表。 然后,当输入空行时,它会告诉用户这些值中有多少是唯一的。 我做这个是出于现实生活的原因,而不是作为习题集。

enter: happy
enter: rofl
enter: happy
enter: mpg8
enter: Cpp
enter: Cpp
enter:
There are 4 unique words!

我的代码如下:

# ask for input
ipta = raw_input("Word: ")

# create list 
uniquewords = [] 
counter = 0
uniquewords.append(ipta)

a = 0   # loop thingy
# while loop to ask for input and append in list
while ipta: 
  ipta = raw_input("Word: ")
  new_words.append(input1)
  counter = counter + 1

for p in uniquewords:

..到目前为止我就知道这么多 我不知道如何计算一个列表中唯一的单词数? 如果有人可以发布解决方案,这样我就可以从中学习,或者至少向我展示它是如何伟大的,谢谢!


当前回答

另一种方法是用熊猫

import pandas as pd

LIST = ["a","a","c","a","a","v","d"]
counts,values = pd.Series(LIST).value_counts().values, pd.Series(LIST).value_counts().index
df_results = pd.DataFrame(list(zip(values,counts)),columns=["value","count"])

然后,您可以将结果导出为所需的任何格式

其他回答

另外,使用集合。计数器重构你的代码:

from collections import Counter

words = ['a', 'b', 'c', 'a']

Counter(words).keys() # equals to list(set(words))
Counter(words).values() # counts the elements' frequency

输出:

['a', 'c', 'b']
[2, 1, 1]

如何:

import pandas as pd
#List with all words
words=[]

#Code for adding words
words.append('test')


#When Input equals blank:
pd.Series(words).nunique()

它返回列表中有多少个唯一值

如果你想要一个唯一值的直方图,这里是lineer

import numpy as np    
unique_labels, unique_counts = np.unique(labels_list, return_counts=True)
labels_histogram = dict(zip(unique_labels, unique_counts))

使用集合:

words = ['a', 'b', 'c', 'a']
unique_words = set(words)             # == set(['a', 'b', 'c'])
unique_word_count = len(unique_words) # == 3

有了这个,你的解决方案可以很简单:

words = []
ipta = raw_input("Word: ")

while ipta:
  words.append(ipta)
  ipta = raw_input("Word: ")

unique_word_count = len(set(words))

print "There are %d unique words!" % unique_word_count

另一种方法是用熊猫

import pandas as pd

LIST = ["a","a","c","a","a","v","d"]
counts,values = pd.Series(LIST).value_counts().values, pd.Series(LIST).value_counts().index
df_results = pd.DataFrame(list(zip(values,counts)),columns=["value","count"])

然后,您可以将结果导出为所需的任何格式