re.compile not working well

March 29, 2018, at 11:17 AM

I have this list of keywords to use:

keywords = ['a', 'about', 'advance', 'advanced', 'affect', 'after', 'ameliorate', 'among', 'and', 'any', 'apply', 'are', 'as', 'at', 'be', 'been', 'better', 'fix', 'fixed', 'following', 'for', 'form', 'from', 'from a', 'further', 'get', 'got', 'have', 'having', 'help', 'hike', 'hold', 'i', 'impact', 'improve', 'in',  'why', 'will', 'with', 'work with', 'would', 'you', 'your', 'of',]

I am using simple sentences such as these:

'risk to healthy and fitness'
'risk of healthy and fitness'

My code is this:

keywords = keywords
def Searchy():
    name = 'risk to healthy and fitness'
    name33 = ['exercise','fit','fitness','cardio',]#standard words
    regex1 = re.compile(r'\b(%s+.])\b'%'|'.join(name33))
    regex2 = re.compile(r'\b(%s+.)\b'%'|'.join(keywords))
    h = [m.start()for m in re.finditer (regex1one,name)]
    name55 = [name[h[0]:]][0]
    print name55

I want to filter out most of the clutter words and just get the string starting from the first keyword, with a result such as:

'to healthy and fitness'

If my first keyword is 'of', I get a correct string such as:

'of healthy and fitness'

If my first keyword is any other word instead of 'of', I get this instead:

'healthy and fitness'

I want all results to be the same for all keywords. What could I be doing wrong, and how do I get it right?

Answer 1

I think your issue is in regex1. You build it from name33, so it looks for the words in that list and gives you everything from that match onward. When I build it from name instead, it gives the correct output.

import re

def Searchy():
    keywords = ['a', 'about', 'advance', 'advanced', 'affect', 'after', 'ameliorate', 'among', 'and', 'any', 'apply', 'are', 'as', 'at', 'be', 'been', 'better', 'fix', 'fixed', 'following', 'for', 'form', 'from', 'from a', 'further', 'get', 'got', 'have', 'having', 'help', 'hike', 'hold', 'i', 'impact', 'improve', 'in', 'why', 'will', 'with', 'work with', 'would', 'you', 'your', 'of']
    name = 'risk to healthy and fitness'
    name33 = ['exercise', 'fit', 'fitness', 'cardio']  # standard words
    # regex1 is now built from name; joining a string puts '|' between its characters
    regex1 = re.compile(r'\b(%s+.])\b' % '|'.join(name))
    regex2 = re.compile(r'\b(%s+.)\b' % '|'.join(keywords))
    # start offset of every regex1 match in name
    h = [m.start() for m in re.finditer(regex1, name)]
    name55 = name[h[0]:]  # slice from the first match to the end
    print(name55)

Searchy()

Also, you have regex1one in your h statement, which is not defined anywhere. I changed it to regex1.
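
To check what re.compile actually receives in each case, it can help to build the pattern string and print it before compiling. The following is only a minimal sketch in Python 3 syntax (the code above uses the Python 2 print statement). build_pattern is a hypothetical helper name, the template is the one used for regex2, and the short word lists and the test sentence are made up for illustration.

```python
import re

def build_pattern(words):
    # Same template as regex2: '|'.join() puts a bar between the
    # items, and %s drops the joined text into the group.
    return r'\b(%s+.)\b' % '|'.join(words)

# Joining a list keeps each word as one alternative.
list_pattern = build_pattern(['exercise', 'fit', 'fitness', 'cardio'])
print(list_pattern)

# Joining a string iterates over its characters, so every character
# (including the space) becomes its own alternative.
string_pattern = build_pattern('to be')
print(string_pattern)

# The trailing '+.' ends up attached only to the last alternative.
regex = re.compile(build_pattern(['to', 'of']))
matches = [m.group() for m in regex.finditer('risk to health, risk of injury')]
print(matches)
```

Printing the pattern this way shows that the '+.' suffix belongs to the last word in the list only, which is one reason a keyword placed last, such as 'of', can behave differently from the others.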

Answer 2

Your code works exactly as you wrote it:

If my first keyword is 'of' i get a correct string

Yes, because 'of' is indeed in your keyword list.

If my first keyword is any other word used instead of 'of', i get this instead

Yes, because in the examples you gave, the only words before 'healthy and fitness' are 'risk', 'to' and 'of', and of those only 'of' is in the keyword list you provided. If you want the same result for the sentence containing 'to', you'll need to add 'to' to the keyword list.
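
One possible way to apply this is sketched below, in Python 3 syntax. It is a rewrite under stated assumptions rather than the asker's code: from_first_keyword is a hypothetical helper, the keyword list is shortened and has 'to' added, and the use of re.escape and longest-first ordering are additions that do not appear in the original.

```python
import re

def from_first_keyword(sentence, keywords):
    # Longest first, so 'from a' is tried before 'from' and 'a'.
    ordered = sorted(keywords, key=len, reverse=True)
    # re.escape guards against keywords containing regex metacharacters.
    pattern = r'\b(?:%s)\b' % '|'.join(re.escape(k) for k in ordered)
    match = re.search(pattern, sentence)
    # If no keyword is found, return the sentence unchanged
    # (an assumed fallback, not something the question specifies).
    return sentence[match.start():] if match else sentence

# Shortened, hypothetical keyword list with 'to' added.
keywords = ['and', 'from', 'from a', 'of', 'to', 'with']

print(from_first_keyword('risk to healthy and fitness', keywords))
print(from_first_keyword('risk of healthy and fitness', keywords))
```

With 'to' in the list, both sentences are cut at the word after 'risk', so the two cases behave the same way.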
