Subtitles not working // different source format (xml) #132

@GoogleCodeExporter

Watching Sherlock: subtitles are not working.
Watching EastEnders: subtitles are working.

Reason: Different XML File Formats
Sherlock:
http://www.bbc.co.uk/iplayer/subtitles/ng/modav/p01n6lw7_b03nh9m6_1388621734545.xml
EastEnders:
http://www.bbc.co.uk/iplayer/subtitles/ng/modav/p01n6lw7_b03nh9m6_1388621734545.xml

Solution (Diff):
747a748,752
>     import xml.etree.ElementTree as ET
>     root = ET.fromstring(txt)
>     body = root.find("{http://www.w3.org/2006/10/ttaf1}body")
>     div = body.find("{http://www.w3.org/2006/10/ttaf1}div")
> 
753c758
<     for line in txt.split('\n'):
---
>     for line in div.findall("{http://www.w3.org/2006/10/ttaf1}p"):
755,758c760,783
<         m = p.match(line)
<         if m:
<             start_mil = "%s000" % m.group(2) # pad out to ensure 3 digits
<             end_mil   = "%s000" % m.group(4)
---
>         m = line
> 
>         if m is not None:
>             xml_begin = m.get('begin')
>             xml_begin_sec = xml_begin
>             xml_begin_mil = ""
>             
>             if '.' in xml_begin:
>                 xml_begin_sec = xml_begin.rsplit('.', 1)[0]
>                 xml_begin_mil = xml_begin.rsplit('.', 1)[1]
> 
>             xml_end = m.get('end')
>             xml_end_sec = xml_end
>             xml_end_mil = ""
>             
>             if '.' in xml_end:
>                 xml_end_sec = xml_end.rsplit('.', 1)[0]
>                 xml_end_mil = xml_end.rsplit('.', 1)[1]
> 
>             start_mil = "%s000" % xml_begin_mil # pad out to ensure 3 digits
>             end_mil   = "%s000" % xml_end_mil
>             subtitle = ""
> 
>             if m.text is not None: subtitle = m.text
760c785
<             ma = {'start'     : m.group(1),
---
>             ma = {'start'     : xml_begin_sec,
762c787
<                   'end'       : m.group(3),
---
>                   'end'       : xml_end_sec,
764c789
<                   'text'      : m.group(5)}
---
>                   'text'      : subtitle}
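
For anyone who wants to try the idea outside the plugin, here is a minimal standalone sketch of the same approach: parse the TTAF document with ElementTree and read `begin`/`end` from each `<p>` element. The function names (`split_timestamp`, `parse_ttaf`), the `start_mil`/`end_mil` dictionary keys, and the sample XML are assumptions made for illustration; they are not taken from the plugin or from a real BBC file. Unlike the patch, this sketch uses `itertext()` so that text nested in child elements such as `<span>` is also collected, which `m.text` alone would likely miss.

```python
import xml.etree.ElementTree as ET

# Namespace used in the diff above for TTAF subtitle elements.
TTAF_NS = "{http://www.w3.org/2006/10/ttaf1}"


def split_timestamp(value):
    """Split 'HH:MM:SS.fff' into ('HH:MM:SS', 'fff').

    The fractional part is padded or truncated to exactly 3 digits,
    so '01.5' yields '500' and a missing fraction yields '000'.
    """
    seconds, _, fraction = value.partition(".")
    millis = (fraction + "000")[:3]
    return seconds, millis


def parse_ttaf(xml_text):
    """Return a list of cue dicts for every <p> element in the document."""
    root = ET.fromstring(xml_text)
    cues = []
    for p in root.iter(TTAF_NS + "p"):
        begin, end = p.get("begin"), p.get("end")
        if begin is None or end is None:
            continue  # skip paragraphs without timing information
        start, start_mil = split_timestamp(begin)
        stop, end_mil = split_timestamp(end)
        # itertext() also collects text inside nested elements like <span>.
        text = "".join(p.itertext()).strip()
        cues.append({"start": start, "start_mil": start_mil,
                     "end": stop, "end_mil": end_mil,
                     "text": text})
    return cues


# Hypothetical sample document, written only for this example.
SAMPLE = """<tt xmlns="http://www.w3.org/2006/10/ttaf1"><body><div>
<p begin="00:00:01.5" end="00:00:03.250">Hello <span>there</span></p>
<p begin="00:00:04" end="00:00:05.12">Second line</p>
</div></body></tt>"""

if __name__ == "__main__":
    for cue in parse_ttaf(SAMPLE):
        print(cue)
```

Because the elements are looked up by tag rather than by line, the result should not depend on how the XML file is wrapped, which may be what differs between the two programmes.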

Original issue reported on code.google.com by m...@alexrieger.de on 5 Jan 2014 at 5:03
