Watch Sherlock: subtitles not working.
Watch EastEnders: subtitles working.
Reason: the two programmes use different XML file formats.
Sherlock:
http://www.bbc.co.uk/iplayer/subtitles/ng/modav/p01n6lw7_b03nh9m6_1388621734545.xml
EastEnders:
http://www.bbc.co.uk/iplayer/subtitles/ng/modav/p01n6lw7_b03nh9m6_1388621734545.xml
Solution (Diff):
747a748,752
> import xml.etree.ElementTree as ET
> root = ET.fromstring(txt)
> body = root.find("{http://www.w3.org/2006/10/ttaf1}body")
> div = body.find("{http://www.w3.org/2006/10/ttaf1}div")
>
753c758
< for line in txt.split('\n'):
---
> for line in div.findall("{http://www.w3.org/2006/10/ttaf1}p"):
755,758c760,783
< m = p.match(line)
< if m:
< start_mil = "%s000" % m.group(2) # pad out to ensure 3 digits
< end_mil = "%s000" % m.group(4)
---
> m = line
>
> if m is not None:
> xml_begin = m.get('begin')
> xml_begin_sec = xml_begin
> xml_begin_mil = ""
>
> if '.' in xml_begin:
> xml_begin_sec = xml_begin.rsplit('.', 1)[0]
> xml_begin_mil = xml_begin.rsplit('.', 1)[1]
>
> xml_end = m.get('end')
> xml_end_sec = xml_end
> xml_end_mil = ""
>
> if '.' in xml_end:
> xml_end_sec = xml_end.rsplit('.', 1)[0]
> xml_end_mil = xml_end.rsplit('.', 1)[1]
>
> start_mil = "%s000" % xml_begin_mil # pad out to ensure 3 digits
> end_mil = "%s000" % xml_end_mil
> subtitle = ""
>
> if m.text is not None: subtitle = m.text
760c785
< ma = {'start' : m.group(1),
---
> ma = {'start' : xml_begin_sec,
762c787
< 'end' : m.group(3),
---
> 'end' : xml_end_sec,
764c789
< 'text' : m.group(5)}
---
> 'text' : subtitle}
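
For anyone who wants to try the approach outside the plugin, here is a minimal standalone sketch of the same idea: parse the subtitle XML with ElementTree instead of matching it line by line with a regex. It is not the plugin's actual code. The names split_timestamp, parse_subtitles and SAMPLE, the 'start_mil'/'end_mil' dictionary keys, and the sample document are made up for illustration; only the namespace and the begin/end attributes are taken from the diff above. The milliseconds are cut to three digits after padding, which is my reading of the "pad out to ensure 3 digits" comment.

```python
import xml.etree.ElementTree as ET

# Namespace as used in the diff above.
TTAF_NS = "{http://www.w3.org/2006/10/ttaf1}"


def split_timestamp(value):
    """Split 'HH:MM:SS.mmm' into ('HH:MM:SS', 'mmm').

    A missing or short fractional part is padded with zeros to 3 digits.
    """
    if "." in value:
        seconds, millis = value.rsplit(".", 1)
    else:
        seconds, millis = value, ""
    return seconds, (millis + "000")[:3]


def parse_subtitles(txt):
    """Return one dict per <p> element that has both begin and end."""
    root = ET.fromstring(txt)
    cues = []
    # iter() finds <p> at any depth, so it does not depend on the
    # exact body/div nesting of a particular file.
    for p in root.iter(TTAF_NS + "p"):
        begin, end = p.get("begin"), p.get("end")
        if begin is None or end is None:
            continue  # skip cues without usable timing
        start_sec, start_mil = split_timestamp(begin)
        end_sec, end_mil = split_timestamp(end)
        # itertext() also collects text inside child elements such as
        # <span>; p.text alone stops at the first child element.
        text = "".join(p.itertext()).strip()
        cues.append({"start": start_sec, "start_mil": start_mil,
                     "end": end_sec, "end_mil": end_mil,
                     "text": text})
    return cues


# Hypothetical sample document, not taken from a real BBC file.
SAMPLE = """<tt xmlns="http://www.w3.org/2006/10/ttaf1">
  <body>
    <div>
      <p begin="00:00:01.5" end="00:00:03">Hello <span>there</span></p>
      <p begin="00:00:04.250" end="00:00:06.000">Second line</p>
    </div>
  </body>
</tt>"""
```

Calling parse_subtitles(SAMPLE) gives two cues. One difference from the diff: it reads m.text, which only returns the text before the first child element, so a cue containing <span> or <br/> may come out truncated. The sketch uses itertext() for that reason, but it does not turn <br/> into a line break.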
Original issue reported on code.google.com by m...@alexrieger.de on 5 Jan 2014 at 5:03