I'm trying to find the opening < and the text of a tag (without the
attributes or closing tags)
This is what I'm using:
(\S[^\s/>]*)
Which, I think, reads as:
(any number of non-whitespace characters [up to a space, /, or >])
Is that correct? I can't get it to work.
If my text is:
<tag
then it returns "<tag" which is what I want.
However, if I have:
<tag/ or <tag>
it instead matches "/" or ">" respectively.
Why?
darrel wrote:
> I'm trying to find the opening < and the text of a tag (without the
> attributes or closing tags)
> This is what I'm using:
> (\S[^\s/>]*)
> Which, I think, reads as:
> (any number of non-whitespace characters [up to a space, /, or >])
> Is that correct? I can't get it to work.
> If my text is:
> <tag
> then it returns "<tag" which is what I want.
> However, if I have:
> <tag/ or <tag>
> it instead matches "/" or ">" respectively.
> Why?
In my brief testing, when run against "<tag/" it first matches "<tag" -
then the next match is "/". The second match matches "/" because it
matches the \S character class.
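A minimal sketch of that test, written in Python on the assumption that its `re` module treats this particular pattern the same way the .NET engine does (the helper name `all_matches` is made up for illustration). It shows that the scan resumes right after "<tag", where the leading `\S` is free to match the "/" or ">" on its own:

```python
import re

# Darrel's pattern: ONE non-whitespace character (\S), followed by any run
# of characters that are not whitespace, "/" or ">".
PATTERN = re.compile(r"(\S[^\s/>]*)")


def all_matches(text):
    """Return every non-overlapping match, scanning left to right."""
    return PATTERN.findall(text)


# The negated class only restricts the characters AFTER the first one,
# so a lone "/" or ">" still satisfies the leading \S.
print(all_matches("<tag/"))
print(all_matches("<tag>"))
```

In other words, the negated class never gets a say over the first character of a match.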
Post some examples of how you want the regex to behave, and maybe
someone can help put one together.
--
mikeb
> In my brief testing, when run against "<tag/" it first matches "<tag" -
> then the next match is "/". The second match matches "/" because it
> matches the \S character class.
But shouldn't this: [^/] stop it from doing that?
Here's how I want the regex to behave:
I want to find the first 'word' in the string. This would be any number of
characters in a row up to (but not including) a space, a new line, a /, or a >.
So in this:
"hello there, how are you"
it should match 'hello'
in this:
"<blockquote>hello there, how are you"
it should match '<blockquote'
Thanks!
-Darrel
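One way to get the behaviour described above, sketched in Python rather than .NET (an assumption, as are the names `FIRST_WORD` and `first_word`): drop the leading `\S`, require at least one character from the negated class, and ask only for the first match.

```python
import re

# Hypothetical pattern: one or more characters that are not
# whitespace, "/" or ">".
FIRST_WORD = re.compile(r"[^\s/>]+")


def first_word(text):
    """Return the first run of allowed characters, or None if there is none."""
    m = FIRST_WORD.search(text)  # search() stops at the first match
    return m.group(0) if m else None


print(first_word("hello there, how are you"))
print(first_word("<blockquote>hello there, how are you"))
```

Note that `search` skips over any leading delimiters, so a string starting with "/" would return the word that follows it.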
> But shouldn't this: [^/] stop it from doing that?
Aha. Mike, you are correct!
Here's what's happening. If this is my text:
<blockquote>monkey</blockquote>
and this is my Regex:
\S[^>]*
It returns these matches:
<blockquote
>monkey</blockquote
So, it's returning the last match, I suppose. This is where I get lost. How
do I get it to ONLY return the first match?
Got it!
The problem was the very next group I was using.
I had this:
(\S[^\s/>]*)
but had to add another group:
((\s|\n[^\S>]*)|(>))
which checks for whitespace/new lines OR a closing tag.
-Darrel
Use the Match method of the Regex object (System.Text.RegularExpressions):
Dim m As Match = yourRegEx.Match(inputText)
m holds only the first match; m.Value is the matched text.
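For comparison, a rough Python analogue of the same idea (an assumption: `re.search` stands in for .NET's `Regex.Match`, and `finditer` for `Regex.Matches`; the helper names are invented for this sketch). Asking for a single match returns only "<blockquote", while iterating over all matches also yields the later ones:

```python
import re

PATTERN = re.compile(r"\S[^>]*")
TEXT = "<blockquote>monkey</blockquote>"


def first_match(text):
    """Return only the first match, or None if nothing matches."""
    m = PATTERN.search(text)
    return m.group(0) if m else None


def every_match(text):
    """Return all non-overlapping matches, left to right."""
    return [m.group(0) for m in PATTERN.finditer(text)]


print(first_match(TEXT))
print(every_match(TEXT))
```

With the closing ">" present in the text, the full scan also reports a third match consisting of that final ">" by itself.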
"darrel" wrote:
> > But shouldn't this: [^/] stop it from doing that?
> Aha. Mike, you are correct!
> Here's what's happening. If this is my text:
> <blockquote>monkey</blockquote>
> and this is my Regex:
> \S[^>]*
> It returns these matches:
> <blockquote
> >monkey</blockquote
> So, it's returning the last match, I suppose. This is where I get lost. How
> do I get it to ONLY return the first match?