Python RE Directories and slashes

February 01, 2022, at 02:20 AM

Let's say I have a string holding a root directory that the user has entered:

'C:/Users/Me/'

Then I use os.listdir() and join with it to create a list of subdirectories.

I end up with a list of strings that are like below:

'C:/Users/Me/Adir\Asubdir\'

and so on.

I want to split the subdirectories and capture each directory name as its own element. Below is one attempt. I seem to be having issues with the \ and / characters. I assume \ is the escape character, so to me '[\\/]' says "look for \ or /", and '[\\/]([\w\s]+)[\\/]' as a match pattern should find any word between two slashes. But the output is only ['/Users/'] and nothing else is matched. So then I added an escape for the forward slash:

'[\\\/]([\w\s]+)[\\\/]'

However, my output then becomes only ['Users','ADir'], which is confusing the crud out of me.

My question is mainly: how do I tokenize each directory from a string that uses both \ and /? And also, why is my RE not working as I expect?

Minimal Example:

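import re, os

# Pattern from above: a run of word characters or spaces between two slashes.
info = re.compile('[\\\/]([\w ]+)[\\\/]')

root = 'C:/Users/i12500198/Documents/Projects/'

def getFiles(wdir=None):
    # Full paths of the files directly inside wdir (default: current directory).
    wdir = os.getcwd() if wdir is None else wdir
    files = (os.path.join(wdir, file) for file in os.listdir(wdir)
             if os.path.isfile(os.path.join(wdir, file)))
    return list(files)

def getDirs(wdir=None):
    # Full paths of the directories directly inside wdir.
    wdir = os.getcwd() if wdir is None else wdir
    dirs = (os.path.join(wdir, adir) for adir in os.listdir(wdir)
            if os.path.isdir(os.path.join(wdir, adir)))
    return list(dirs)

def walkSubdirs(root, below=None):
    # Recursively collect every directory below root.
    if below is None:  # a [] default would be shared between calls
        below = []
    subdirs = getDirs(root)
    for aDir in subdirs:
        below.append(aDir)
        walkSubdirs(aDir, below)
    return below

subdirs = walkSubdirs(root)

for aDir in subdirs:
    files = getFiles(aDir)
    for f in files:
        finfo = info.findall(f)
        print(f)
        print(finfo)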
Answer 1

I want to split the subdirectories and capture each directory name as its own element

Instead of regular expressions, I suggest you use one of Python's standard functions for parsing filesystem paths.

Here is one using pathlib:

from pathlib import Path
p = Path("C:/Users/Me/ADir\\ASub Dir\\2 x 2 Dir\\")
p.parts
#=> ('C:\\', 'Users', 'Me', 'ADir', 'ASub Dir', '2 x 2 Dir')

Note that the behaviour of pathlib.Path depends on the system running Python. Since I'm on a Linux machine, I actually used pathlib.PureWindowsPath here. I believe the output should be accurate for those of you on Windows.
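
As for why the regex seems to skip directories: my reading is that re.findall returns non-overlapping matches, and the pattern consumes the slash on both sides of each name. After '/Users/' is matched, the search resumes at 'Me/', which no longer has a leading slash, so every other directory is missed. The sketch below compares the original pattern with two alternatives, a lookahead that checks the trailing slash without consuming it and a plain re.split on either separator. The sample path and the variable names are made up for illustration.

```python
import re

# Mixed separators; backslashes are escaped so Python does not
# misread them as escape sequences.
path = "C:/Users/Me/ADir\\ASubDir\\"

# Original idea: each match consumes BOTH surrounding separators,
# so the next name has no leading separator left to match.
consuming = re.compile(r"[\\/]([\w ]+)[\\/]")
skipped = consuming.findall(path)

# Lookahead variant: the trailing separator is required but not consumed,
# so it can serve as the leading separator of the next match.
lookahead = re.compile(r"[\\/]([\w ]+)(?=[\\/])")
all_names = lookahead.findall(path)

# Simplest regex option: split on runs of either separator and drop
# the empty string left by the trailing separator.
tokens = [t for t in re.split(r"[\\/]+", path) if t]

print(skipped)    # ['Users', 'ADir']
print(all_names)  # ['Users', 'Me', 'ADir', 'ASubDir']
print(tokens)     # ['C:', 'Users', 'Me', 'ADir', 'ASubDir']
```

Writing the patterns as raw strings (r"...") also avoids having to guess how many backslashes survive Python's own string escaping. I would still prefer pathlib for real paths, since it handles the drive letter and other edge cases.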
