<?xml version="1.0" encoding="utf-8"?>

The XML declaration. It defines the XML version (1.0) and the encoding used.

<Grabber>

<Info treatErrorAsWarning="" language="" availableDays="" timezone="" version="" />

Attribute	Type	Example	Description
treatErrorAsWarning	Boolean	true	Controls what WebEPG does when it runs into an error (for example, when there are no programs on a channel for a day). If set to true, WebEPG continues grabbing the next day. If set to false, WebEPG stops processing the current channel and continues with the next channel.
language	String	ru
availableDays	Integer	14	The number of days of TV listings available on the site. For example if availableDays="7" but the user set Grab Days = 14 in the WebEPG plugin, only 7 days would be grabbed.
timezone	String	GMT Standard Time	The name of the time zone for which the listings are provided.
version	String	2.0
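
As an illustration, the example values from the table above can be combined into a single Info element. This is only a sketch: the values are the table's samples, not a recommended configuration.

```xml
<!-- Sketch only: every attribute value is the example given in the table above. -->
<Info treatErrorAsWarning="true" language="ru" availableDays="14" timezone="GMT Standard Time" version="2.0" />
```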

<Channels>

All channels for this site are listed in child elements.

<Channel id="" siteId="" />

The information for each channel

Attribute	Type	Example	Description
Required
id	String	svt1@svt.se	Should match a channel id configured in channels.xml. If different grabbers are getting EPG data for the same channel they should use the same Channel id.
siteId	String	SVT1	The identifier for the channel on the site. This will be used by the [ID] variable in the Site element to construct the URL used to download EPG data.
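
A minimal sketch of a Channels section, using the example values from the table. The second channel is hypothetical and is included only to show that each channel gets its own Channel element.

```xml
<Channels>
  <!-- id should match an entry in channels.xml; siteId is what [ID] expands to in the Site url -->
  <Channel id="svt1@svt.se" siteId="SVT1" />
  <!-- Hypothetical second channel, for illustration only -->
  <Channel id="svt2@svt.se" siteId="SVT2" />
</Channels>
```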

</Channels>

End of Channels section.

<Listing type="">

Attribute	Value(s)	Description
Required
type	Html, Xml, JSON, Data	The type of the listing format of the target EPG data. Would normally be set to Html to grab EPG listings from a website.

<Site url="" post="" external="" encoding="" delay="" user-agent=""/>

Attribute	Type	Example	Description
Required
url	URL	http://tvguide.com/[ID]/[YYYY]-[MM]-[DD]	The URL, with variables, used when grabbing EPG data for the different channels and days. See the table below for an explanation of the variables. Note: Replace all ampersands (&) in the URL with &amp;
external	Boolean	false	Use external browser (IE) for downloading page data. Will load certain JavaScript sections.
Optional
post
encoding	String	utf-8	Normally auto-detected. If special characters look wrong in your EPG Guide, try setting the correct encoding. For example ISO-8859-1 or UTF-8.
delay	Integer	1000	Time in milliseconds to wait between each HTTP request. Might be useful if the website stops responding when too many requests are made in a short period of time.
user-agent	String	Mozilla/5.0 (Windows NT 6.2; WOW64; rv:16.0) Gecko/20100101 Firefox/16.0	WebEPG connects to the target website, and since it isn't a proper web browser some sites may reject the connection. You can use the User-Agent string to simulate a web user in each HTTP request. If not specified, the default will be used. Default is Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)

The url and post attributes can use variables, which are replaced during grabbing to construct the URLs for the different channels and days.

Tag	Description
[ID]	Site channel ID, taken from the siteId attribute of the Channel element in the Channels section.
[LIST_OFFSET]	Offset position in a list that is longer than one page. Starts at 0, and MaxCount is added for each following page. If the number of listings on a page is less than MaxCount, WebEPG stops looking for more pages.
[PAGE_OFFSET]	Same as [LIST_OFFSET], but only 1 is added for each new page instead of MaxCount.
[DAY_OFFSET]	Offset of the day from today (0). Use startOffset attribute of the Search element to change the start.
[YYYY]	Year
[MONTH]	Month full name (e.g. January)
[MM]	Month with leading 0
[_M]	Month without leading 0
[WEEKDAY]	Day of week full name (e.g. Monday). Weekday names can be changed by including a WeekDayNames section in the Search element.
[DAY_OF_WEEK]	Day of week as a number. 0 = Sunday, 6 = Saturday. Specifying the startOffset attribute in the Search element shifts the first day of the week by the same number of days. E.g. when startOffset is set to 2: 0 = Friday, 6 = Thursday.
[DD]	Day with leading 0
[_D]	Day without leading 0
[EPOCH_TIME]	Number of seconds since 1/1/1970 8:00:00 AM
[EPOCH_DATE]	Number of days since 1/1/1970 8:00:00 AM
[DAY_NAME]	A string for the name. For example: today, tomorrow, etc. Requires DayNames child element of Search.
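
The following sketch shows how the variables might expand, using the example URL from the Site table. It assumes a channel with siteId="SVT1" and a grab date of 13 January 2008; both are chosen purely for illustration.

```xml
<!-- Assumed for illustration: siteId="SVT1", grab date 13 January 2008. -->
<Site url="http://tvguide.com/[ID]/[YYYY]-[MM]-[DD]" external="false" delay="1000" />
<!-- Under these assumptions the requested URL would be:
     http://tvguide.com/SVT1/2008-01-13 -->
```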

<Search startOffset="" maxlistings="" listStart="" startPage="" endPage="" language="" weekday="" />

Attribute	Value(s)	Description
Optional
startOffset	Integer	Used for configuration of [DAY_OFFSET] and [DAY_OF_WEEK].
maxlistings	Integer	Used for configuration of [LIST_OFFSET] and [PAGE_OFFSET].
listStart	Integer	Used for configuration of [LIST_OFFSET].
startPage	Integer	Used for configuration of [PAGE_OFFSET].
endPage	Integer	Used for configuration of [PAGE_OFFSET].
language	String	Language to use for [WEEKDAY]. Must be a specific country/language not a neutral language group. For example "es-ES" not just "es".
weekday	dddd, ddd	Format for weekday (long or short).

<DayNames>

<Day>value</Day>

The name of the day to be used with [DAY_NAME] tag in URL.

</DayNames>

End of DayNames.

<WeekDayNames>

Optional section to redefine weekday names. If present these will be used instead of the weekday format specified above.

<WeekDay>value</WeekDay>

The name of each day to be used with the [WEEKDAY] tag in the URL. The first day is Sunday by default, but can be shifted by setting startOffset. Increasing startOffset shifts the days backwards. E.g. when startOffset = 1, the first day is Saturday.

</WeekDayNames>

End of weekday names section.

</Search>

End of Search element.
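
A sketch of a Search element that supplies custom names for [DAY_NAME] and [WEEKDAY]. The names are hypothetical and would need to be replaced with the strings the target site actually uses in its URLs. The Sunday-first order assumes startOffset is left unset, and the Day entries are assumed to be listed in day order starting with today.

```xml
<Search>
  <!-- Values substituted for [DAY_NAME]; hypothetical site-specific strings -->
  <DayNames>
    <Day>today</Day>
    <Day>tomorrow</Day>
  </DayNames>
  <!-- Values substituted for [WEEKDAY]; the first entry is Sunday by default -->
  <WeekDayNames>
    <WeekDay>sun</WeekDay>
    <WeekDay>mon</WeekDay>
    <WeekDay>tue</WeekDay>
    <WeekDay>wed</WeekDay>
    <WeekDay>thu</WeekDay>
    <WeekDay>fri</WeekDay>
    <WeekDay>sat</WeekDay>
  </WeekDayNames>
</Search>
```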

<Html>

Must match listing type. Child elements contain tools for parsing HTML web pages.

<Template name="" start="" end="">

Every grabber must have at least one Template element. It contains the TemplateText element which will be used to match fields, such as #START and #TITLE, on the target web site. It's possible to include different templates for use on different target web pages. For example #TITLE could be matched on the main web page and #DESCRIPTION from a subpage.

Attribute	Type	Example	Description
Required
name	String	default	The template name. A template named "default" is required.
Optional
start	String	<!-- Program -->	A string to search for which signifies the start of the listing area. E.g. a heading or the class used by the list. Everything before this string will be ignored.
end	String	<!-- End Content -->	A string to search for which signifies the end of the listing area. Everything after this string will be ignored.

<SectionTemplate tags="">

Attribute	Type	Example	Description
Required
tags	String	TSD	The first letter of each HTML tag to be used for matching. Letters must be in upper case. Multiple tags are given in a string. All other HTML tags can be ignored when creating the Template Text.

Some common tags:

Letter	Tag(s)
T	All table tags <table>, <tr>, <td>, <th>, etc
D	<div>
S	<span>
P	<p>
H	<h1>, <h2>, etc
I	<img>
A	<a>

Although the first letter is not unique for every different HTML tag, it is generally good enough to build a unique template for finding data on the page.

<TemplateText>

The template consists of the HTML tags and data fields that make up the program listing. It can be made up of any HTML tags; however, ONLY those listed in the tags attribute of the SectionTemplate will be used for matching. The others will be ignored. Only the element name of a tag is used for matching! Attributes are ignored. For example, the template "<SPAN class="class1">" will match any <SPAN> tag, not only those with class="class1". However, it is useful to write self-descriptive template text rather than the shortest possible one.
The template special tags are used by WebEPG to locate the required data.

See WebEPG Template for detailed information on how to create the TemplateText.

Tag	Description
Required
#START or #STARTXMLTV	Program start time. Possible #START time formats: hh:MM am/pm, HH:MM, HH.MM, HHhMM. #STARTXMLTV format: 20080113011500
#TITLE	Program title.
Optional
#END	Program end time.
#ENDXMLTV	Program end time in XMLTV format.
#DESCRIPTION	Program description text.
#DAY	Program day (required if not part of page look up).
#MONTH
#SUBTITLE	Program or series episode name.
#GENRE	Program genre.
#EPISODE	Episode number.
#SEASON	Season number.
#ACTORS	List of actors.
*MATCH	Dynamic tag used by MatchList to find a text string in the HTML code.
*VALUE	Dynamic tag used in combination with *MATCH to store the matched text in the field specified in the Match element.
Z	Used to make a template for a variable structure and deal with optional information. See Dynamic Templates for more information.

</TemplateText>

End of the TemplateText.
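
A minimal sketch of a template for a listing laid out as table rows. The HTML structure is hypothetical. The markup inside TemplateText is written as XML entities on the assumption that the grabber file must remain well-formed XML; compare with an existing grabber before relying on this form.

```xml
<Template name="default" start="&lt;!-- Program --&gt;" end="&lt;!-- End Content --&gt;">
  <!-- tags="T": only table tags (table, tr, td, th) take part in matching -->
  <SectionTemplate tags="T">
    <TemplateText>
      &lt;tr&gt;
        &lt;td&gt;&lt;#START&gt;&lt;/td&gt;
        &lt;td&gt;&lt;#TITLE&gt;&lt;/td&gt;
      &lt;/tr&gt;
    </TemplateText>
  </SectionTemplate>
</Template>
```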

<MatchList>

List of Match elements used together with the *MATCH and *VALUE dynamic template tags to grab text from HTML code where the normal tags can't be used. See Dynamic Templates for more information.

<Match field="" match="" />

Attribute	Type	Example	Description
Required
field	#FIELD	#ACTORS	The matched text of the *VALUE dynamic tag will be stored in this field.
match	String	Cast:	This string is used by the *MATCH dynamic tag to search for text in a position relative to the *VALUE tag.

</MatchList>

End of MatchList
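
Using the example values from the table, a MatchList could look like the sketch below. How *MATCH and *VALUE are placed in the TemplateText is not shown here; see Dynamic Templates for that.

```xml
<MatchList>
  <!-- Text located relative to "Cast:" by the *MATCH and *VALUE tags is stored in #ACTORS -->
  <Match field="#ACTORS" match="Cast:" />
</MatchList>
```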

</SectionTemplate>

End of the SectionTemplate

</Template>

End of the Template

<DataPreference>

<Preference template="" title="" subtitle="" genre="" description="" />

Attribute	Value(s)	Description
Required
template	template name	The name of the template
title	0-3	Preference of this value
subtitle	0-3	Preference of this value
genre	0-3	Preference of this value
description	0-3	Preference of this value

</DataPreference>

<Sublinks>

Sublinks are linked pages that contain extra data, that may not be provided on the main listing page. Optional.

<Sublink search="" template="">

Attribute	Value(s)	Example	Description
Required
search	Search string	/guiden/expand	String to identify the correct <A href> tag for this sublink. Only part of the target link needs to be specified. WebEPG will automatically use the entire href attribute to download the subpage.
template	Template name	details	Name of the template to use for this sublink. Must match a template name.

<Link url="" post="" external="" encoding="" user-agent="" delay=""/>

Optional. Only required if URL cannot be built from the main site URL. See Site url for details.

Attribute	Value(s)	Example	Description
url	URL	http://www.sol.no/guiden/expand.cgi?[1]	[1] can be used to match unique parts of the link, such as an ID for the show.
post
external	Boolean	false	Use external browser (IE) for downloading page data. Will load certain Javascript sections.
user-agent	String	Mozilla/5.0 (Windows NT 6.2; WOW64; rv:16.0) Gecko/20100101 Firefox/16.0	WebEPG will connect to the target website and since it isn't a proper web browser some sites may reject the connection. You can use the User-Agent string to simulate a web user in each HTTP request. If not specified the default will be used. Note that if you specify user-agent in the Site element it will NOT be propagated here.
delay	Integer	500	Time in milliseconds to wait between each HTTP request. Might be useful if the website stops responding when too many requests are made in a short period of time.

Example

To match the following HTML code:

<a href="http://www.tvtoday.de/programm/detail/?sid=107036189606&format=detail">

we may use a url attribute like:

http://www.tvtoday.de/programm/detail/?sid=[1]&amp;format=detail
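
Put together, the Sublink for this example might look like the sketch below. The search string and the template name "details" are assumptions made for illustration; a Template with that name would have to be defined in the grabber.

```xml
<Sublinks>
  <!-- "details" is a hypothetical template name; a Template with this name must exist -->
  <Sublink search="/programm/detail/" template="details">
    <!-- [1] stands for the part of the link that differs per program (here the sid value) -->
    <Link url="http://www.tvtoday.de/programm/detail/?sid=[1]&amp;format=detail" />
  </Sublink>
</Sublinks>
```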

</Sublink>

End of this Sublink.

</Sublinks>

End of the Sublinks section.

<Searches>

<Search match="" field="" remove="" />

Attribute	Value(s)	Description
Required
match	regex search	regex to find data
field	#Field name	Name of the field used to store the data
remove	true/false	Remove data from store. Stops data being added to other fields

This element searches the whole section of the source page that matches the template, including all tags, their attributes and values. It finds the text matching the given regular expression and stores it in the given field. If remove is set to true, the whole text matching the regular expression is cut out of the source page, so it will not be part of the output from template parsing. It can also be used to remove undesired parts of descriptions, titles, etc. More than one Search can be used; however, only the latest match will be used.

Same fields as for TemplateText are allowed.

Example:

<Search match="\([0-9]{1,3}[,][0-9]{0,3}\)" field="#EPISODE" remove="true" />
<Search match="\([0-9]{1,3}\)" field="#EPISODE" remove="true" />
<Search match="\([0-9]{1,3}[/][0-9]{0,3}\)" field="#EPISODE" remove="true" />

These searches look for an episode number in any of the forms (N), (N1,N2) or (N/Count). The episode number will be removed.

</Searches>

End of Searches section

<DateTime>

<Month>value</Month>

Used for matching the <#MONTH> tag in a template. Only required if the <#MONTH> tag is used in a template.

Value is the text as found on the site. There must be 12 months in the correct order (Jan-Dec).
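
A sketch of a complete DateTime section. The abbreviations below are hypothetical; each value should be the exact text the target site shows for that month.

```xml
<DateTime>
  <!-- Twelve entries, January first; replace with the text used on the target site -->
  <Month>Jan</Month>
  <Month>Feb</Month>
  <Month>Mar</Month>
  <Month>Apr</Month>
  <Month>May</Month>
  <Month>Jun</Month>
  <Month>Jul</Month>
  <Month>Aug</Month>
  <Month>Sep</Month>
  <Month>Oct</Month>
  <Month>Nov</Month>
  <Month>Dec</Month>
</DateTime>
```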

</DateTime>

End of DateTime

</Html>

End of the Html section

<Xml channel="" xpath="">

Must match listing type. Child elements contain tools for parsing XML results.

Tag	Description
Required
channel	Filter to apply to the list of EPG-elements
xpath	Xpath expression which returns the EPG-elements

<TemplateText>

The template consists of the HTML tags and data fields that make up the program listing. It can be made up of any HTML tags; however, ONLY those listed in the tags attribute of the SectionTemplate will be used for matching. The others will be ignored. Only the element name of a tag is used for matching! Attributes are ignored. For example, the template "<SPAN class="class1">" will match any <SPAN> tag, not only those with class="class1". However, it is useful to write self-descriptive template text rather than the shortest possible one.
The template special tags are used by WebEPG to locate the required data.

See WebEPG Template for detailed information on how to create the TemplateText.

Tag	Description
Required
#START or #STARTXMLTV	Program start time. Possible #START time formats: hh:MM am/pm, HH:MM, HH.MM, HHhMM. #STARTXMLTV format: 20080113011500
#TITLE	Program title.
Optional
#END	Program end time.
#ENDXMLTV	Program end time in XMLTV format.
#DESCRIPTION	Program description text.
#DAY	Program day (required if not part of page look up).
#MONTH
#SUBTITLE	Program or series episode name.
#GENRE	Program genre.
#EPISODE	Episode number.
#SEASON	Season number.
#ACTORS	List of actors.

</TemplateText>

End of the TemplateText.

<Fields>

Mapping from xml-attributes to EPG-fields

This contains a number of "Field" nodes, with these attributes:

<Field name="" xmlname="" />

Tag	Description
Required
name	EPG field (e.g. #START or #TITLE)
xmlname	Name of xml-attribute

</Fields>

End of the Fields

Example:

<Xml channel="id=28" xpath="airing">
    <Fields>
        <Field name="#START" xmlname="air_time" />
        <Field name="#TITLE" xmlname="title" />
        <Field name="#DESCRIPTION" xmlname="description" />
    </Fields>
</Xml>

</Xml>

End of Xml section

<JSON channel="" xpath="">

Used to parse JSON data returned from the web source.

Tag	Description
Required
channel	Filter to apply to the list of EPG-elements
xpath	Xpath expression which returns the EPG-elements. Note that this is a custom implementation of Xpath, and not all possibilities are supported (yet).

<Fields>

Mapping from JSON-attributes to EPG-fields. This contains a number of "Field" nodes, with these attributes:

<Field name="" jsonname="" />

Tag	Description
Required
name	EPG field (e.g. #START or #TITLE)
jsonname	JSON-attribute

</Fields>

End of the Fields

Example:

<JSON channel="channel/id=28" xpath="airing">
    <Fields>
        <Field name="#START" jsonname="air_time" />
        <Field name="#TITLE" jsonname="title" />
        <Field name="#DESCRIPTION" jsonname="episode/original_title" />
    </Fields>
</JSON>

</JSON>

End of JSON section

<Data>

Must match listing type.

</Listing>

End of Listing section

<Actions>

<Modify channel="" field="" search="" action="">value</Modify>

Attribute	Value(s)	Description
Required
channel	* or channel id	The channel on which the modify will be performed. (* = all channels)
field	field to modify
search	search string
action	Replace/Remove
value	string to replace	Only required for Replace action.
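
A sketch of an Actions section with one Modify of each kind. The field names, search strings and replacement text are hypothetical, and the example assumes that the element text of a Replace is the string that takes the place of the matched text.

```xml
<Actions>
  <!-- Replace on all channels: the element text is assumed to be the replacement string -->
  <Modify channel="*" field="#TITLE" search="Movie: " action="Replace">Film: </Modify>
  <!-- Remove on a single channel: no value is given -->
  <Modify channel="svt1@svt.se" field="#DESCRIPTION" search="Repeat" action="Remove" />
</Actions>
```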

</Actions>

End of the Actions section

</Grabber>

End of the grabber config

Further Information

See also: Time Zones

Changelog

Change	Date	Release
WebEPG Grabber	2013/11/01	1.6.0
