Hi!
Take a look at BeautifulSoup.
It is a Python module designed for parsing HTML.
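
To make the idea concrete, here is a rough sketch of one way to group a page's text by tag. It does not use BeautifulSoup: it relies only on html.parser from the standard library, so it runs without installing anything. The names TextCollector and collect_text are made up for this example, and the key scheme ("id_<id>", "<tag>_<class>", otherwise the plain tag name) is my guess at what the question asks for. It returns one dictionary rather than a list of one-item dictionaries.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TextCollector(HTMLParser):
    """Group the text of a page by the element that directly contains it."""

    def __init__(self):
        super().__init__()
        self.stack = []                   # open elements as (tag, key) pairs
        self.result = defaultdict(list)   # key -> list of text fragments

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        # Naming scheme guessed from the question:
        # id -> "id_<id>", class -> "<tag>_<class>", otherwise the tag name.
        if attrs.get("id"):
            key = "id_" + attrs["id"]
        elif attrs.get("class"):
            key = tag + "_" + attrs["class"]
        else:
            key = tag
        self.stack.append((tag, key))

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, together with
        # anything left unclosed inside it (such as the <table> in the sample).
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.result[self.stack[-1][1]].append(text)


def collect_text(html):
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return dict(parser.result)


# A shortened version of the page from the question.
page = """<title>TITLE</title>
<body>
body_1
<h1>1_1</h1>
<div id=one>div_one_1</div>
<span class=sp_1>
sp_text
<div id=one>div_one_2</div>
</span>
body_2
<table>
<tr><td>td_1</td>
<td class=sp_2>td_2</td></tr>
</body>
"""

result = collect_text(page)
for key, texts in result.items():
    print(key, texts)
```

Run on the full page from the question, this approach would also pick up p_3 under 'p' and the repeated div_one_2 under 'id_one', both of which the expected result in the question leaves out. BeautifulSoup is more forgiving of badly nested markup, so treat this only as a starting point.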
Carlo
what is ITER? www.iter.org
>>
>>First, excuse my English. English is not my native language, but I hope
>>that I will be able to describe my problem.
>>
>>I am new to Python for the web, but I want to do the following:
>>
>>Suppose I have an HTML page like this:
>>"""
>><title>TITLE</title>
>><body>
>>body_1
>><h1>1_1</h1>
>><h2>2_1</h2>
>><div id=one>div_one_1</div>
>><p>p_1</p>
>><p>p_2</p>
>><div id=one>div_one_2</div>
>><span class=sp_1>
>>sp_text
>><div id=one>div_one_2</div>
>><div id=one>div_one_3</div>
>></span>
>><h3>3_1</h3>
>><h2>2_2</h2>
>><p>p_3</p>
>>body_2
>><h1>END</h1>
>><table>
>><tr><td>td_1</td>
>><td class=sp_2>td_2</td>
>><td>td_3</td>
>><td>td_4</td></tr>
>>
>></body>
>>
>>"""
>>
>>I want to get all the info from this HTML into a dictionary that looks
>>like this:
>>
>>rezult = [{'title':['TITLE']},
>>{'body':['body_1', 'body_2']},
>>{'h1':['1_1', 'END']},
>>{'h2':['2_1', '2_2']},
>>{'h3':['3_1']},
>>{'p':['p_1', 'p_2']},
>>{'id_one':['div_one_1', 'div_one_2', 'div_one_3']},
>>{'span_sp_1':['sp_text']},
>>{'td':['td_1', 'td_3', 'td_4']},
>>{'td_sp_2':['td_2']},
>>
>>]
>>
>>I hope you understand what I need.
>>Can you advise me what approaches exist for solving tasks of this type,
>>and maybe show some practical examples?
>>Thanks in advance for any help.
>>
>>
>>
>>Try ElementTree or Amara.
>>
>>
>>
>>If you only care about the contents, BeautifulSoup is the answer.
>>
>>Ismael
>>
>>Tutor maillist - Tutor (AT) python (DOT) org