Regex: return only the message blocks, i.e. text that starts with "message" and everything between the parent message's curly braces

July 01, 2022, at 5:00 PM

I want to extract only the message definitions: each match should start at the word message and include everything between the curly braces of the parent (top-level) message. With the code below I also get the service definition along with the messages, which I don't want. Any suggestions? Thanks in advance.

String data = "/**\r\n" + 
        " * file\r\n" + 
        " */\r\n" + 
        "syntax = \"proto3\";\r\n" + 
        "package demo;\r\n" + 
        "\r\n" + 
        "import \"envoyproxy/protoc-gen-validate/validate/validate.proto\";\r\n" + 
        "import \"google/api/annotations.proto\";\r\n" + 
        "import \"google/protobuf/wrappers.proto\";\r\n" + 
        "import \"protoc-gen-swagger/options/annotations.proto\";\r\n" + 
        "\r\n" + 
        "option go_package = \"bitbucket.com;\r\n" + 
        "option java_multiple_files = true;\r\n" + 
        "\r\n" + 
        "schemes: HTTPS;\r\n" + 
        "consumes: \"application/json\";\r\n" + 
        "produces: \"application/json\";\r\n" + 
        "responses: {\r\n" + 
        "key:\r\n" + 
        "    \"404\";\r\n" + 
        "value: {\r\n" + 
        "description:\r\n" + 
        "    \"not exist.\";\r\n" + 
        "schema: {\r\n" + 
        "json_schema: {\r\n" + 
        "type:\r\n" + 
        "    STRING;\r\n" + 
        "}\r\n" + 
        "}\r\n" + 
        "}\r\n" + 
        "}\r\n" + 
        "responses: {\r\n" + 
        "key:\r\n" + 
        "    \"401\";\r\n" + 
        "value: {\r\n" + 
        "description:\r\n" + 
        "    \"Wrong user.\";\r\n" + 
        "schema: {\r\n" + 
        "json_schema: {\r\n" + 
        "type:\r\n" + 
        "    STRING;\r\n" + 
        "};\r\n" + 
        "example: {\r\n" + 
        "value:\r\n" + 
        "    '{ \"message\": \"wrong user.\" }'\r\n" + 
        "}\r\n" + 
        "}\r\n" + 
        "}\r\n" + 
        "}\r\n" + 
        "\r\n" + 
        "message message1 {\r\n" + 
        "    message message2 {\r\n" + 
        "        enum Enum {\r\n" + 
        "            UNKNOWN = 0;    \r\n" + 
        "        }\r\n" + 
        "    }\r\n" + 
        "    string id = 1;\r\n" + 
        "    string name = 3;\r\n" + 
        "    string account = 4;\r\n" + 
        "}\r\n" + 
        "\r\n" + 
        "message User{\r\n" + 
        "   string firstName = 1 ;\r\n" + 
        "   string lastName  = 2 ;\r\n" + 
        "   string middleName  = 3 [(validate.rules).repeated = { min_items: 0 }];\r\n" + 
        "}\r\n" + 
        "\r\n" + 
        "service Userlogin{\r\n" + 
        "   rpc Login(User) returns (APIResponse);\r\n" + 
        "}";
List<String> allmsg = Arrays.asList(data.replaceAll("(?sm)\\A.*?(?=message)", "").split("\\R+(?=message)"));

I expect the resulting list of strings to have size 2, with the contents below.

allmsg.get(0) should be

message message1 {
    message message2 {
        enum Enum {
            UNKNOWN = 0;    
        }
    }
    string id = 1;
    string name = 3;
    string account = 4;
}

allmsg.get(1) should be

message User{
    string firstName = 1 ;
    string lastName  = 2 ;
    string middleName  = 3 [(validate.rules).repeated = { min_items: 0 }];
}
Answer 1

Use a Pattern that matches a "message" and stream the match results to a List:

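import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

List<String> allmsg = Pattern.compile("(?ms)^message.*?^}")
  .matcher(data)
  .results() // stream the MatchResults (Matcher.results() needs Java 9+)
  .map(MatchResult::group) // get the entire match
  .collect(Collectors.toList()); // collect as a List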

See live code demo.

Regex breakdown:

  • (?ms) turns on flags s, which makes dot also match newlines, and m, which makes ^ and $ match start and end of each line
  • ^message matches start of a line (not start of input, thanks to the m flag) then "message"
  • .*? reluctantly (i.e. as little as possible) matches any characters (including newlines, thanks to the s flag). The ? makes the quantifier reluctant, which stops a single match from consuming multiple "messages".
  • ^} matches start of a line (not start of input, thanks to the m flag) then "}"

See live regex demo.

This will work even if "messages" are not contiguous with each other, i.e. they may be interspersed with other constructs (your example doesn't have this situation, but the linked demos do).
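To see the interspersed case in isolation, here is a minimal, self-contained sketch built around the same pattern. The class name MessageExtractor, the method name extract, and the sample proto text are made up for illustration and are not part of the question. Note that the pattern relies on the formatting of the input: the closing brace of each top-level message must be at the start of a line, and nested closing braces must be indented.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, only for demonstrating the pattern from the answer.
class MessageExtractor {
    // From a line starting with "message" up to the first later line starting with "}".
    private static final Pattern TOP_LEVEL_MESSAGE = Pattern.compile("(?ms)^message.*?^}");

    static List<String> extract(String proto) {
        return TOP_LEVEL_MESSAGE.matcher(proto)
                .results()                  // Stream<MatchResult>, Java 9+
                .map(MatchResult::group)    // whole match as a String
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample: a service sits between the two top-level messages.
        String sample = String.join("\r\n",
                "syntax = \"proto3\";",
                "message Outer {",
                "    message Inner {",
                "        string tag = 1;",
                "    }",
                "    string id = 1;",
                "}",
                "service Demo {",
                "    rpc Ping(User) returns (User);",
                "}",
                "message User{",
                "    string name = 1;",
                "}");
        List<String> messages = extract(sample);
        System.out.println(messages.size()); // prints 2
        messages.forEach(System.out::println);
    }
}
```

The nested Inner message stays inside the first match because its closing brace is indented, and the service block is skipped because it does not start with "message".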

Answer 2

You should see your other question.

Pattern.compile("(?s)^message(.(?!message|service))*");

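The lookahead stops the match as soon as the text that follows is "message" or "service". So if "message" can appear again after "message", as in

"message message1 {\r\n" +

you must adapt the regex.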
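One possible adaptation, offered as a sketch and not taken from the answer: let the lookahead stop only at a line that starts with "message" or "service", so that an indented or mid-line "message" no longer ends the match. The class name TopLevelBlocks and the method name extract are hypothetical. This assumes top-level blocks start at column 0 and that only messages and services follow a message; any other top-level construct (for example an enum) would be swallowed into the preceding match.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class illustrating one way to adapt the lookahead regex.
class TopLevelBlocks {
    // m flag: ^ matches at line starts; s flag: dot also matches line breaks.
    // Consume characters until the next line that begins with "message" or "service".
    private static final Pattern BLOCK =
            Pattern.compile("(?ms)^message\\b(?:(?!^(?:message|service)\\b).)*");

    static List<String> extract(String proto) {
        return BLOCK.matcher(proto)
                .results()                      // Java 9+
                .map(r -> r.group().trim())     // drop trailing blank lines
                .collect(Collectors.toList());
    }
}
```

Calling TopLevelBlocks.extract(data) on the string from the question should then give one entry per top-level message, with the service block left out.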
