Regular expression help - comma delimited string

January 18, 2022, at 3:50 PM

I don't write many regular expressions, so I'm going to need some help on this one.

I need a regular expression that can validate that a string is an alphanumeric comma delimited string.

Examples:

  • 123, 4A67, GGG, 767 would be valid.
  • 12333, 78787&*, GH778 would be invalid
  • fghkjhfdg8797< would be invalid

This is what I have so far, but it isn't quite right: ^(?=.*[a-zA-Z0-9][,]).*$

Any suggestions?

Answer 1

Sounds like you need an expression like this:

^[0-9a-zA-Z]+(,[0-9a-zA-Z]+)*$

POSIX allows for the more self-descriptive version:

^[[:alnum:]]+(,[[:alnum:]]+)*$
^[[:alnum:]]+([[:space:]]*,[[:space:]]*[[:alnum:]]+)*$  // allow whitespace

If you're willing to admit underscores, too, search for entire words (\w+):

^\w+(,\w+)*$
^\w+(\s*,\s*\w+)*$  // allow whitespaces around the comma
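
A minimal Python sketch of the patterns above, run against the examples from the question. Python's re module does not support POSIX classes such as [[:alnum:]], so the explicit ranges are used instead. The names ALNUM_CSV, WORD_CSV and is_alnum_csv are made up for this illustration.

```python
import re

# Whitespace-tolerant form of the first pattern. Explicit ranges are used
# because Python's re module has no [[:alnum:]] class.
ALNUM_CSV = re.compile(r"^[0-9a-zA-Z]+(\s*,\s*[0-9a-zA-Z]+)*$")

# The \w variant also admits underscores. re.ASCII keeps \w to [a-zA-Z0-9_];
# without it, Python 3 also accepts non-ASCII letters and digits.
WORD_CSV = re.compile(r"^\w+(\s*,\s*\w+)*$", re.ASCII)


def is_alnum_csv(text):
    """Return True if text is a comma-delimited list of alphanumeric items."""
    return ALNUM_CSV.match(text) is not None


for sample in ["123, 4A67, GGG, 767", "12333, 78787&*, GH778", "fghkjhfdg8797<"]:
    print(repr(sample), is_alnum_csv(sample))
```

Only the first sample is reported as valid. Whether underscores should count as "alphanumeric" decides between the two compiled patterns.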

Answer 2

Try this pattern: ^([a-zA-Z0-9]+,?\s*)+$

I tested it with your cases, as well as just a single number "123". I don't know if you will always have a comma or not.

The [a-zA-Z0-9]+ means match one or more of these characters. The ,? means match zero or one comma (basically, the comma is optional). The \s* handles zero or more whitespace characters after the comma. Finally, the outer + says match one or more repetitions of the whole group.

This will also match 123 123 abc (no commas), which might be a problem. It will also match 123, (ends with a comma), which might be a problem too.
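
The caveats above are easy to confirm. The following Python sketch (the names LENIENT, samples and results are only for illustration) runs the pattern against the question's examples plus the two problem cases:

```python
import re

# The pattern from this answer: the comma is optional after every item.
LENIENT = re.compile(r"^([a-zA-Z0-9]+,?\s*)+$")

samples = [
    "123, 4A67, GGG, 767",    # from the question, should be valid
    "12333, 78787&*, GH778",  # from the question, should be invalid
    "123",                    # single item
    "123 123 abc",            # no commas at all
    "123,",                   # trailing comma
]

# Map each sample to True/False depending on whether the pattern matches.
results = {s: LENIENT.match(s) is not None for s in samples}
for s, ok in results.items():
    print(repr(s), ok)
```

Everything except the second sample is accepted, including the comma-free and trailing-comma inputs. If those must be rejected, a pattern that makes the comma mandatory between items is the safer choice.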

Answer 3

You seem to be lacking repetition. How about:

^(?:[a-zA-Z0-9 ]+,)*[a-zA-Z0-9 ]+$

I'm not sure how you'd express that in VB.Net, but in Python:

>>> import re
>>> x = ["123, 4A67, GGG, 767", "12333, 78787&*, GH778"]
>>> r = r'^(?:[a-zA-Z0-9 ]+,)*[a-zA-Z0-9 ]+$'
>>> for s in x:
...     print(re.match(r, s))
...
<re.Match object; span=(0, 19), match='123, 4A67, GGG, 767'>
None
>>>

You can use shortcuts instead of listing the [a-zA-Z0-9 ] part, but this is probably easier to understand.

Analyzing the highlights:

  • [a-zA-Z0-9 ]+ : match one or more (but not zero) characters from the listed ranges, or a space.
  • (?:[...]+,)* : in non-capturing parentheses, match one or more of the characters followed by a comma. Match such sequences zero or more times; matching zero times allows for input with no comma.
  • [...]+ : match at least one of these characters. This does not include a comma, which ensures that a trailing comma is not accepted. If a trailing comma is acceptable, the expression is simpler: ^[a-zA-Z0-9 ,]+$
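
To make the difference between the two expressions concrete, here is a small Python sketch. The names STRICT, LOOSE and check are hypothetical, and the looser pattern is written with a closing $ so that it is anchored at both ends.

```python
import re

# Strict form: every comma must be followed by another non-empty item.
STRICT = re.compile(r"^(?:[a-zA-Z0-9 ]+,)*[a-zA-Z0-9 ]+$")

# Loose form: any mix of the allowed characters and commas.
LOOSE = re.compile(r"^[a-zA-Z0-9 ,]+$")


def check(text):
    """Return (strict_ok, loose_ok) for one input string."""
    return (STRICT.match(text) is not None, LOOSE.match(text) is not None)


for text in ["123, 4A67", "123,", ",123", "12 3, 4"]:
    print(repr(text), check(text))
```

The loose form accepts leading commas as well as trailing ones, so it is only suitable if empty items are acceptable anywhere. Because a space is part of the character class, both forms also accept spaces inside an item, as in "12 3".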

Answer 4

Yes, when you want to match comma-separated things where a trailing comma is not legal, and each thing matches $LONGSTUFF, you have to repeat $LONGSTUFF:

$LONGSTUFF(,$LONGSTUFF)*
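
In Python, this repetition can be generated instead of typed twice. The sketch below is only an illustration: comma_list is a hypothetical helper, and the item patterns are simplified stand-ins for the ['1:a=b,c=d','2:e=f,g=h'] shape, not a real configuration grammar.

```python
import re


def comma_list(item):
    """Return a pattern for one or more `item`s separated by commas.

    `item` is wrapped in a non-capturing group so that any alternation
    inside it cannot leak into the surrounding pattern.
    """
    return r"(?:{0})(?:,(?:{0}))*".format(item)


pair = r"[a-z]=[a-z]"                    # e.g. a=b
entry = r"[0-9]+:" + comma_list(pair)    # e.g. 1:a=b,c=d
quoted = r"'(?:" + entry + r")'"         # e.g. '1:a=b,c=d'
syntax = re.compile(r"\[" + comma_list(quoted) + r"\]$")

print(bool(syntax.match("['1:a=b,c=d','2:e=f,g=h']")))
```

Nesting the helper keeps each level readable, and a trailing comma is rejected at both levels because every comma must be followed by another item.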

If $LONGSTUFF is really long and itself contains comma-separated items, it might be a good idea not to build the regexp by hand and instead let the computer do it for you, even if it's just through string concatenation. For example, I just wanted to build a regular expression to validate the CPUID parameter of a XEN configuration file, of the ['1:a=b,c=d','2:e=f,g=h'] type. I... believe this mostly fits the bill (whitespace notwithstanding!):

import re

xend_fudge_item_re = r"""
  e[a-d]x=          #register of the call return value to fudge
  (
    0x[0-9A-F]+ |   #either hardcode the reply
    [10xks]{32}     #or edit the bitfield directly
  )
"""
xend_string_item_re = r"""
  (0x)?[0-9A-F]+:   #leafnum (the contents of EAX before the call)
  %s                #one fudge
  (,%s)*            #repeated multiple times
""" % (xend_fudge_item_re, xend_fudge_item_re)
xend_syntax = re.compile(r"""
  \[                #a list of
   '%s'             #string elements
   (,'%s')*         #repeated multiple times
  \]
  $                 #and nothing else
""" % (xend_string_item_re, xend_string_item_re), re.VERBOSE | re.MULTILINE)

Answer 5

Try the following expression:

/^([a-z0-9\s]+,)*([a-z0-9\s]+){1}$/i

This will work for:

  1. test
  2. test, test
  3. test123,Test 123,test

I would strongly suggest trimming whitespace from the beginning and end of each item in the comma-separated list.
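
One way to do the trimming, sketched in Python: split on commas first, strip each piece, and validate the pieces individually. The names ITEM and parse_items are hypothetical. Unlike the expression above, this sketch assumes that spaces inside an item are not wanted.

```python
import re

# A single item: one or more ASCII letters or digits, nothing else.
ITEM = re.compile(r"[A-Za-z0-9]+")


def parse_items(text):
    """Split on commas, trim each item, and return the list of items.

    Returns None if any trimmed item is empty or not purely alphanumeric.
    """
    items = [part.strip() for part in text.split(",")]
    if all(ITEM.fullmatch(item) for item in items):
        return items
    return None


print(parse_items("test123 , Test ,test"))
print(parse_items("test,,test"))
```

Splitting before validating also hands back the cleaned items, which is usually what the caller needs after validation anyway.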

Answer 6

Try ^(?!,)((, *)?([a-zA-Z0-9]+)\b)*$

Step by step description:

  • Don't match a beginning comma (good for the upcoming "loop").
  • Match optional comma and spaces.
  • Match characters you like.
  • Matching a word boundary ensures that a comma is required between consecutive items in the string.
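
The steps above can be checked with a short Python sketch. It assumes the character class is repeated with + so that items longer than one character are accepted, since a class without a quantifier matches exactly one character. PATTERN and is_valid are illustrative names.

```python
import re

# Assumption: [a-zA-Z0-9]+ (with +) so that multi-character items match.
PATTERN = re.compile(r"^(?!,)((, *)?([a-zA-Z0-9]+)\b)*$")


def is_valid(text):
    """Return True if the whole string matches the pattern."""
    return PATTERN.match(text) is not None


for text in ["123, 4A67, GGG, 767", "123 456", ",123", "123,", ""]:
    print(repr(text), is_valid(text))
```

Because the outer group is repeated with *, the empty string is also accepted. If that is a problem, an explicit length check before matching is one way around it.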

Answer 7

Please use: ^((([a-zA-Z0-9\s]){1,45},)+([a-zA-Z0-9\s]){1,45})$

Here I have set the maximum word length to 45, since the longest word in English has 45 letters; this can be changed as required.
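
A Python sketch of this pattern with the length limit pulled out into a constant. MAX_LEN, ITEM, PATTERN and is_valid are illustrative names, and the item is written without the inner capturing group, which does not change what matches.

```python
import re

MAX_LEN = 45  # per-item limit used in the answer; adjust as needed

# One item: 1 to MAX_LEN letters, digits, or whitespace characters.
ITEM = r"[a-zA-Z0-9\s]{1,%d}" % MAX_LEN

# One or more "item," groups followed by a final item without a comma.
PATTERN = re.compile(r"^((%s,)+%s)$" % (ITEM, ITEM))


def is_valid(text):
    """Return True if the whole string matches the pattern."""
    return PATTERN.match(text) is not None


print(is_valid("123, 4A67, GGG, 767"))
print(is_valid("123"))
print(is_valid("a" * 46 + ",b"))
```

Note that the + after the first group requires at least one comma, so a single item such as 123 is rejected. Changing that + to * would accept single items as well.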
