How to split a string into a list?

Posted on

Solving problem is about exposing yourself to as many situations as possible like How to split a string into a list? and practice these strategies over and over. With time, it becomes second nature and a natural way you approach any problems in general. Big or small, always start with a plan, use other strategies mentioned here till you are confident and ready to code the solution.
In this post, my aim is to share an overview the topic about How to split a string into a list?, which can be followed any time. Take easy to follow this discuss.

How to split a string into a list?

I want my Python function to split a sentence (input) and store each word in a list. My current code splits the sentence, but does not store the words as a list. How do I do that?

def split_line(text):
    # split the text
    words = text.split()
    # for each word in the line:
    for word in words:
        # print the word
        print(words)
Asked By: Thanx

||

Answer #1:

text.split()

This should be enough to store each word in a list. words is already a list of the words from the sentence, so there is no need for the loop.

Second, it might be a typo, but you have your loop a little messed up. If you really did want to use append, it would be:

words.append(word)

not

word.append(words)
Answered By: nstehr

Answer #2:

Splits the string in text on any consecutive runs of whitespace.

words = text.split()

Split the string in text on delimiter: ",".

words = text.split(",")

The words variable will be a list and contain the words from text split on the delimiter.

Answered By: zalew

Answer #3:

str.split()

Return a list of the words in the string, using sep as the delimiter
… If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

>>> line="a sentence with a few words"
>>> line.split()
['a', 'sentence', 'with', 'a', 'few', 'words']
>>> 
Answered By: gimel

Answer #4:

Depending on what you plan to do with your sentence-as-a-list, you may want to look at the Natural Language Took Kit. It deals heavily with text processing and evaluation. You can also use it to solve your problem:

import nltk
words = nltk.word_tokenize(raw_sentence)

This has the added benefit of splitting out punctuation.

Example:

>>> import nltk
>>> s = "The fox's foot grazed the sleeping dog, waking it."
>>> words = nltk.word_tokenize(s)
>>> words
['The', 'fox', "'s", 'foot', 'grazed', 'the', 'sleeping', 'dog', ',',
'waking', 'it', '.']

This allows you to filter out any punctuation you don’t want and use only words.

Please note that the other solutions using string.split() are better if you don’t plan on doing any complex manipulation of the sentence.

[Edited]

Answered By: tgray

Answer #5:

How about this algorithm? Split text on whitespace, then trim punctuation. This carefully removes punctuation from the edge of words, without harming apostrophes inside words such as we're.

>>> text
"'Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.'"
>>> text.split()
["'Oh,", 'you', "can't", 'help', "that,'", 'said', 'the', 'Cat:', "'we're", 'all', 'mad', 'here.', "I'm", 'mad.', "You're", "mad.'"]
>>> import string
>>> [word.strip(string.punctuation) for word in text.split()]
['Oh', 'you', "can't", 'help', 'that', 'said', 'the', 'Cat', "we're", 'all', 'mad', 'here', "I'm", 'mad', "You're", 'mad']
Answered By: Colonel Panic

Answer #6:

I want my python function to split a sentence (input) and store each word in a list

The str().split() method does this, it takes a string, splits it into a list:

>>> the_string = "this is a sentence"
>>> words = the_string.split(" ")
>>> print(words)
['this', 'is', 'a', 'sentence']
>>> type(words)
<type 'list'> # or <class 'list'> in Python 3.0

The problem you’re having is because of a typo, you wrote print(words) instead of print(word):

Renaming the word variable to current_word, this is what you had:

def split_line(text):
    words = text.split()
    for current_word in words:
        print(words)

..when you should have done:

def split_line(text):
    words = text.split()
    for current_word in words:
        print(current_word)

If for some reason you want to manually construct a list in the for loop, you would use the list append() method, perhaps because you want to lower-case all words (for example):

my_list = [] # make empty list
for current_word in words:
    my_list.append(current_word.lower())

Or more a bit neater, using a list-comprehension:

my_list = [current_word.lower() for current_word in words]
Answered By: dbr

Answer #7:

shlex has a .split() function. It differs from str.split() in that it does not preserve quotes and treats a quoted phrase as a single word:

>>> import shlex
>>> shlex.split("sudo echo 'foo && bar'")
['sudo', 'echo', 'foo && bar']
Answered By: Tarwin

Answer #8:

If you want all the chars of a word/sentence in a list, do this:

print(list("word"))
#  ['w', 'o', 'r', 'd']
print(list("some sentence"))
#  ['s', 'o', 'm', 'e', ' ', 's', 'e', 'n', 't', 'e', 'n', 'c', 'e']
Answered By: BlackBeard

Leave a Reply

Your email address will not be published.