About Question enthuware.ocpjp.v7.2.1425 :
Moderator: admin
-
- Posts: 2
- Joined: Wed Mar 27, 2013 6:48 am
- Contact:
Q: Which of the following patterns will correctly capture all Hex numbers that are delimited by at least one whitespace at either end in an input text?
A: (\s|\b)0[xX][0-9a-fA-F]+(\s|\b)
"0x22" does not contain any spaces, but the number will still be captured.
-
- Site Admin
- Posts: 10061
- Joined: Fri Sep 10, 2010 9:26 pm
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
You need the delimiter if there are multiple numbers in the string. In your example, there is only one number, which matches the pattern, so there is no question of delimiter.
In other words, the question does not ask you to match white space. It asks you to use white space as a delimiter.
HTH,
Paul.
If you like our products and services, please help us by posting your review here.
-
- Posts: 2
- Joined: Wed Mar 27, 2013 6:48 am
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
Then I think the question should clarify that the input string must contain whitespace delimiters.
-
- Site Admin
- Posts: 10061
- Joined: Fri Sep 10, 2010 9:26 pm
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
Hi Aleksey,
I am not sure what you mean because the question does say, "...are delimited by at least one whitespace...". So it is clear that whitespace is to be used as a delimiter.
HTH,
Paul.
-
- Posts: 132
- Joined: Thu May 16, 2013 9:23 am
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
Would it make sense to use the & operator in the regex? Or, more to the point, is there a chance of getting a question on it? :)
The_Nick.
-
- Posts: 7
- Joined: Sat Mar 15, 2014 5:00 pm
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
"0x1+0x2" contains two hex numbers delimited by "+" but not a space. Nevertheless they are matched by the pattern.
The problem with the pattern is that there exist characters which are not a space but form a word boundary. This applies to all non-word characters (eg: "0x1@0x2").
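To make the point above concrete, here is a minimal sketch that runs the pattern from the original answer against inputs with no whitespace at all. The class name WordBoundaryDemo and the helper findAll are made up for this illustration; only the regex itself comes from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // The pattern from the original answer: \b is allowed wherever \s is.
    static final Pattern HEX = Pattern.compile("(\\s|\\b)0[xX][0-9a-fA-F]+(\\s|\\b)");

    // Returns every match found by find(), with any captured whitespace trimmed off.
    static List<String> findAll(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = HEX.matcher(input);
        while (m.find()) {
            found.add(m.group().trim());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll("0x22"));     // no whitespace anywhere
        System.out.println(findAll("0x1+0x2"));  // '+' is a non-word character
        System.out.println(findAll("0x1@0x2"));  // so is '@'
    }
}
```

In all three cases the numbers are reported even though no whitespace delimits them, because \b succeeds at the start and end of the input and next to any non-word character.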
-
- Site Admin
- Posts: 10061
- Joined: Fri Sep 10, 2010 9:26 pm
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
You are right. The pattern should be: (\s|^)0[xX][0-9a-fA-F]+(\s|$)
-
- Posts: 7
- Joined: Sat Mar 15, 2014 5:00 pm
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
Thanks for your fast replies.
A problem with the new pattern is that it doesn't match two hex numbers separated by just one space, e.g. "0x1 0x2". Only the first hex number is matched because the space is already consumed by the first match.
To fix this I've added \G (the end of the previous match):
(^|\s|\G)0[xX][0-9a-fA-F]+(\s|$)
I am not sure if the \G operator needs to be known for the exam.
In practice I would put the hex number itself inside parentheses (as a capturing group) to exclude the spaces from the match.
Regards
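A rough sketch of the difference, for anyone who wants to try it. The two regexes are the ones from this thread, except that the hex number is wrapped in its own capturing group as suggested above; the class name PreviousMatchDemo, the constant names and the helper hexNumbers are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PreviousMatchDemo {
    // Admin's corrected pattern, with group 2 added around the hex number.
    static final Pattern WITHOUT_G = Pattern.compile("(\\s|^)(0[xX][0-9a-fA-F]+)(\\s|$)");
    // Same pattern plus \G, which matches where the previous match ended.
    static final Pattern WITH_G = Pattern.compile("(^|\\s|\\G)(0[xX][0-9a-fA-F]+)(\\s|$)");

    // Collects group 2 (the hex number without its delimiters) of every match.
    static List<String> hexNumbers(Pattern p, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            found.add(m.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "0x1 0x2"; // exactly one space between the numbers
        System.out.println(hexNumbers(WITHOUT_G, input)); // [0x1]
        System.out.println(hexNumbers(WITH_G, input));    // [0x1, 0x2]
    }
}
```

With two spaces between the numbers both patterns should find both of them, since the second space is still available as the leading delimiter of the second number.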
-
- Site Admin
- Posts: 10061
- Joined: Fri Sep 10, 2010 9:26 pm
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
Are you sure? I just tried it and it matches correctly. Here is the code:
Pattern pattern = Pattern.compile("(\\s||^)0[xX][0-9a-fA-F]+(\\s||$)");
Matcher matcher = pattern.matcher("0x22 0x44");
while (matcher.find()) {
    System.out.println("Found the text "+matcher.group()+" starting at " +matcher.start()+" and ending at index "+ matcher.end());
}
Output:
Found the text 0x22 starting at 0 and ending at index 5
Found the text 0x44 starting at 5 and ending at index 9
-
- Posts: 7
- Joined: Sat Mar 15, 2014 5:00 pm
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
Yes, I am sure. Somehow double-pipes (||) came into your code.
Try using:
Pattern pattern = Pattern.compile("(\\s|^)0[xX][0-9a-fA-F]+(\\s|$)");
-
- Site Admin
- Posts: 10061
- Joined: Fri Sep 10, 2010 9:26 pm
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
You are right. Not sure why || works. Must be fixed.
-
- Posts: 7
- Joined: Sat Mar 15, 2014 5:00 pm
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
(\\s||^) "works" because it means
* space
* or nothing
* or beginning of the line
The problem with it is that it also accepts delimiters other than a space.
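A small sketch of that effect, assuming the double-pipe pattern exactly as posted earlier in the thread. The class name EmptyAlternativeDemo and the helper findAll are hypothetical names used only here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyAlternativeDemo {
    // "||" leaves an empty alternative in each group, so both groups may match nothing.
    static final Pattern LENIENT = Pattern.compile("(\\s||^)0[xX][0-9a-fA-F]+(\\s||$)");

    // Returns every match found by find(), with any captured whitespace trimmed off.
    static List<String> findAll(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = LENIENT.matcher(input);
        while (m.find()) {
            found.add(m.group().trim());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll("0x22 0x44")); // looks correct with a space...
        System.out.println(findAll("0x1+0x2"));   // ...but '+' is accepted as well
        System.out.println(findAll("abc0x1g"));   // and so is no delimiter at all
    }
}
```

Because the empty alternative always succeeds, the groups no longer constrain anything, and the pattern behaves much like a bare 0[xX][0-9a-fA-F]+.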
-
- Posts: 33
- Joined: Mon May 06, 2013 9:41 am
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
In the example for the first option, it seems that 0x1a is in fact captured, not just 0x1.
In the description of the second option, is the underscore in "[a-zA-Z_0-9]" needed?
-
- Posts: 35
- Joined: Mon Jul 28, 2014 2:05 am
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
Regex is hard for me to understand, so please clarify one thing.
regex "[\s\b]0[xX][0-9a-fA-F]+[\s\b]" won't compile. The error is "Illegal/unsupported escape sequence near index 4".
The square brackets ('[' and ']') mean "OR" or "RANGE", don't they? So why is "[abc]" OK while "[\s\b]" is not?
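A possible explanation, offered as my understanding rather than anything official: square brackets define a set of characters, while \b is a zero-width test of a position (a word boundary) and not a character, so it cannot be a member of the set. \s is fine inside brackets because it stands for actual characters. The sketch below only checks which of the regexes compile; the class name CharClassBoundaryDemo and the helper compiles are invented for the example.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class CharClassBoundaryDemo {
    // Returns true if the regex compiles, false if Pattern rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // For the bracketed version this is the "Illegal/unsupported escape sequence" error.
            System.out.println(e.getDescription() + " near index " + e.getIndex());
            return false;
        }
    }

    public static void main(String[] args) {
        // \b inside a character class is rejected.
        System.out.println(compiles("[\\s\\b]0[xX][0-9a-fA-F]+[\\s\\b]"));
        // The same idea written as an alternation is accepted.
        System.out.println(compiles("(\\s|\\b)0[xX][0-9a-fA-F]+(\\s|\\b)"));
        // A class containing only real characters, including \s, is accepted.
        System.out.println(compiles("[abc\\s]"));
    }
}
```

So to combine a character test with a position test, alternation with | inside a group seems to be the tool to use, not a character class.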
-
- Site Admin
- Posts: 10061
- Joined: Fri Sep 10, 2010 9:26 pm
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
Not really sure why
-
- Posts: 97
- Joined: Wed Dec 28, 2016 9:00 am
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
The explanation should use matcher.end()-1 to show the correct ending index, since end() returns the index just past the last matched character.
-
- Site Admin
- Posts: 10061
- Joined: Fri Sep 10, 2010 9:26 pm
- Contact:
Re: About Question enthuware.ocpjp.v7.2.1425 :
In Java, the ending index is almost always one past the last element. For example, substring(1, 3) returns the characters at indexes 1 and 2 (not 3). The ending character (or element, in the case of a list) is excluded. The explanation just prints the value of matcher.end() from that perspective. Changing it to end()-1 would just cause confusion.
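A short sketch of why the exclusive end index is convenient: the values from start() and end() can be passed straight to substring() without any adjustment. The class name EndIndexDemo and the helper firstMatchBounds are made up for this example, and the simplified regex has no delimiter groups.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EndIndexDemo {
    // Returns {start, end} of the first hex number in the input, or an empty array if none.
    static int[] firstMatchBounds(String input) {
        Matcher m = Pattern.compile("0[xX][0-9a-fA-F]+").matcher(input);
        if (!m.find()) {
            return new int[0];
        }
        return new int[] { m.start(), m.end() };
    }

    public static void main(String[] args) {
        String input = "0x22 0x44";
        int[] bounds = firstMatchBounds(input);
        // end() is exclusive, just like the second argument of substring().
        System.out.println(bounds[0] + ".." + bounds[1]);             // 0..4
        System.out.println(input.substring(bounds[0], bounds[1]));    // 0x22
        System.out.println("hello".substring(1, 3));                  // el
    }
}
```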