Perl

  • regex..gah

  • No.1 | | 1511 bytes | |

    i been doing perl long enough that you'd think i should know this, but one
    thing i've never ever ever managed to get my head around is how regex works.

    i'm using Net::POP3 (Mail::POP3Client doesn't work!), and i'm trying to
    extract certain data from the pop3 stream (from, subject, and some of the
    body eventually). but the regex behind matching the line required is just
    baffling me. i could cheat and say
    if (substr($line,0,5) eq "From:") { }, but i want to get past the phase of
    doing things such a long way round, and learn more about regex.

    i wrote a program some months back which utilised a complex regex sub
    $onchan{lc($data[0])} =~ s/(,|^)\Q$data[1]\E(?=,|$)//;
    which substitutes the exact match for $data[1] in a long string which is
    csv, and replaces it with nothing. i'm trying to use the same routine, or the
    same method, to get the 'dan' out of
    "dan" dan@domain.com
    but a) the regex confuses me enough to not know how to get that out of there
    (i've replaced the ,'s with "'s, but that's obviously not enough), and b) i
    don't know how to give the regex the full line of text, and assign the
    extracted value into a variable for use later on.

    i'd really like to get my head round this. if anyone's got any good guides,
    pointers, or places i can go to read and help me learn more, i'd much
    appreciate it.

    many thanks.

    dan
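
For reference, here is a minimal sketch of what the substitution quoted in this post does. The list contents and the variable names `$list`, `$entry` and `$trimmed` are made up for illustration; they are not taken from the original program.

```perl
use strict;
use warnings;

# Hypothetical data, not taken from the original program.
my $list  = 'alice,bob,a.b,carol';
my $entry = 'bob';

# (,|^)    the entry must start right after a comma or at the start of the string
# \Q...\E  quotes regex metacharacters in $entry so it is matched literally
# (?=,|$)  the entry must be followed by a comma or the end of the string,
#          but the lookahead does not consume that comma
(my $trimmed = $list) =~ s/(,|^)\Q$entry\E(?=,|$)//;

print "$trimmed\n";    # prints "alice,a.b,carol"
```

Note that removing the first entry this way leaves a leading comma behind, because the comma after the entry is only looked at, not consumed.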
  • No.2 | | 3201 bytes | |

    hi

    thanks for the pointers. i think the news thing, or something, made the
    question a little unclear, as it stripped some characters from the input.
    the input is the default way email headers are sent with the "from" and "to"
    portions,
    "name" <email@example.com>
    the question was about getting 'name' out of "name" at the start, and
    the email address out of the <> brackets. however i managed to figure that
    one out by playing.
    the other question i have involves searching a string. i'm a little
    unfamiliar with this process as well. i'm basically writing a script to
    automate the sending out of emails, and it checks the pop3 box every so
    often and sends out emails depending on what's sat in there, then deletes
    them. in this case it's paypal emails. messages paypal send to say i've
    received a payment. this needs to be expanded further than just picking
    particular words out of the entire subject line as this may not always be
    the case. say for example, the subject said:
    item no.00001313131 - notification of an instant payment received from
    ebay_user (ebayuser@example.com)
    i can use if ($subject =~ /$item_number/) to pick out the item number from that,
    but i need to search for the email address and the username in the subject.
    i suppose the easiest way to do this would be to search the string for
    "(wildcard-here)" and take one off the index of that to find the username in
    question (as this is always going to change). is there any easy way to do
    this with regex or otherwise?

    i hope this makes it a little clearer!

    thanks, dan

    "Dr.Ruud" <rvtol+news (AT) isolution (DOT) nl> wrote in message
    @lists.develooper.com
    "Dan" wrote:

    > if (substr($line,0,5) eq "From:")

    You don't even need to know that 'From:' is 5 characters, if you use

    if ( 0 == index $line, 'From:' )

    > i wrote a program some months back which utilised a complex regex sub
    > $onchan{lc($data[0])} =~ s/(,|^)\Q$data[1]\E(?=,|$)//;
    > which substitutes the exact match for $data[1] in a long string which
    > is csv, and replace it with nothing. i'm trying to use the same
    > routine, or the same method, to get the 'dan' out of
    > "dan" dan@domain.com
    > but a) the regex confuses me enough to not know how to get that out
    > of there (i've replaced the ,'s with "'s, but that's obviously not
    > enough), and b) i don't know how to give the regex the full line of
    > text, and assign the extracted value into a variable for use later on.

    Please give better examples of input and expected output. Use
    "example.com" in examples.

    This contains a regexp that will remove the non-@ characters from the
    start of a string:

    echo 'dan@example.com' | perl -pe 's/^[^@]+//'

    See also: perldoc -q strip
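
A minimal sketch of the name and address extraction Dan describes at the top of this message, which also shows one way to hand a whole line to a regex and keep the extracted values in variables. The sample line and the names `$line`, `$name` and `$address` are assumptions for illustration. Real From headers come in more shapes than this (unquoted names, bare addresses), so treat the pattern as a starting point only.

```perl
use strict;
use warnings;

# Sample header in the format described above.
my $line = 'From: "name" <email@example.com>';

# "([^"]*)"  captures the text between the double quotes
# <([^>]+)>  captures the text between the angle brackets
# A match in list context returns the captures, or an empty list on failure.
my ($name, $address) = $line =~ /^From:\s*"([^"]*)"\s*<([^>]+)>/;

if (defined $address) {
    print "name: $name, address: $address\n";
}
```

If the line does not match, both variables stay undefined, which is why the example checks `defined $address` before using them.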
  • No.3 | | 2089 bytes | |

    On 5/7/06, Dan <dan (AT) danneh (DOT) org> wrote:

    > to get the 'dan' out of "dan" dan@domain.com

    Not too clear on what this means, but here's a stab:

    my $string = 'dan@domain.com';
    $string =~ s/^(.*?)@//;

    This removes everything up to and including the "@", so $string is
    left holding "domain.com".

    If you want the username instead, match against the original string:

    my ($username) = $string =~ m/^(.*?)@/;

    > if (substr($line,0,5) eq "From:") { }, but i want to get past the phase of

    if ($line =~ m/^From:/i) {
        # do something with the "From" line
    }
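
A small sketch that puts the pieces of this reply side by side: capturing with a match, stripping with a substitution, and the three prefix tests that have come up in the thread (`substr`, `index`, and a regex). The lower-case sample header and the variable names are assumptions chosen to show where the approaches differ.

```perl
use strict;
use warnings;

my $string = 'dan@domain.com';

# A match in list context returns the captured groups and leaves $string alone.
my ($username) = $string =~ m/^(.*?)@/;

# A substitution edits its target, so work on a copy to keep the original.
(my $domain = $string) =~ s/^.*?@//;

print "$username / $domain\n";    # prints "dan / domain.com"

# Three ways to test for a header prefix; only the regex can ignore case.
my $line      = 'from: dan@domain.com';
my $by_substr = substr($line, 0, 5) eq 'From:';    # false: case differs
my $by_index  = index($line, 'From:') == 0;        # false: case differs
my $by_regex  = $line =~ m/^From:/i;               # true

print "only the regex matched\n" if $by_regex && !$by_substr && !$by_index;
```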
  • No.4 | | 1810 bytes | |

    The perldocs are an excellent source for learning regex:
    perlrequick
    perlretut
    perlre
    perlreref

    I've also found a nifty little program that will help you test out your
    regex to see if it works:
    http://weitz.de/regex-coach/

    SkyBlueshoes
  • No.5 | | 855 bytes | |

    "Dan" wrote:

    Please don't top-post.

    > "name" <email@example.com>
    > the question was pertaining to get 'name' out of "name" at the start,
    > and the email address out of the <> brackets. however i managed to
    > figure that one out by playing.

    Don't play, but use Regexp::Common::Email::Address.

    > say for example, the subject said:
    > item no.00001313131 - notification of an instant payment received
    > from ebay_user (ebayuser@example.com)

    Use something like

    if ( /^Subject:\sitem\sno[.](\d+)\s-\s
          notification\sof\san\sinstant\s
          payment\sreceived\sfrom\s
          (.*?)\s
          [(](.+?[@].+?)[)]$/x )
    {
        print "item:$1, name:$2, address:$3\n";
    }

    (untested)

    If the length of some of the \s (whitespace) can be >1, make those \s+.
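
A runnable sketch along the lines of the pattern above, tried against the sample subject from the thread. The `Subject:` prefix on the sample line, the variable names, and the use of `\s+` everywhere are assumptions. The address part is deliberately loose and is not a validator. Because `/x` ignores literal whitespace inside the pattern, every space that must match is written as `\s+`.

```perl
use strict;
use warnings;

# Sample header line, built from the subject quoted in the thread.
my $header = 'Subject: item no.00001313131 - notification of an instant '
           . 'payment received from ebay_user (ebayuser@example.com)';

my ($item, $user, $address) = $header =~ /
    ^Subject: \s+ item \s+ no [.] (\d+) \s+ - \s+
    notification \s+ of \s+ an \s+ instant \s+
    payment \s+ received \s+ from \s+
    (.*?) \s+                             # user name, as short as possible
    [(] ( [^()\s]+ [@] [^()\s]+ ) [)]     # address inside the brackets
    \s* \z                                # optional trailing whitespace, then end
/x;

if (defined $item) {
    print "item:$item, name:$user, address:$address\n";
}
```

Assigning the match in list context keeps the captures in named variables, so they are not overwritten by the next successful match the way `$1`, `$2` and `$3` would be.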
