Overview

About

Newznab+ is a PHP/Smarty application that indexes usenet headers into a MySQL database and provides a simple web-based search interface onto the data.

It includes simple CMS facilities and SEO friendly URLs, and is designed to let users create a community around their index.

For information on how to install, please refer to INSTALL.txt.

To discuss, visit us on IRC at irc.synirc.net #newznab, or use the web client.

Newznab+ is licensed under a non-redistributable license. For details, please refer to LICENCE.txt. An open source (GPLv3) edition of Newznab is freely available for those wishing to create derivative works.

How it Works

Usenet groups are specified (by name or by regex), and message headers (binaries and parts) are downloaded for those groups. Releases are created from completed sets of binaries by applying regexes to the message subject, and are categorised in the same way. Metadata from TvRage, TMDb, Rotten Tomatoes, IMDb and Amazon is then applied to each created release.

Choosing Newsgroups

Groups can be manually entered if you know the name. Groups can also be bulk added when specified as a regular expression. For example, if you want to index the groups alt.bin.blah.* and alt.bin.other, use the value alt.bin.blah.*|alt.bin.other.
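
As a rough illustration of how such a value selects groups, here is a small Python sketch (not Newznab's own PHP code). It assumes the expression must match the whole group name; whether Newznab anchors the pattern this way is an assumption, and the group names are made up.

```python
import re


def matching_groups(pattern, available_groups):
    """Return the groups whose full name matches the bulk-add pattern."""
    compiled = re.compile(pattern)
    # fullmatch is an assumption: the whole group name must match.
    return [g for g in available_groups if compiled.fullmatch(g)]


# Hypothetical group list, for illustration only.
groups = [
    "alt.bin.blah.one",
    "alt.bin.blah.two",
    "alt.bin.other",
    "alt.bin.otherstuff",
    "alt.bin.unrelated",
]
selected = matching_groups(r"alt.bin.blah.*|alt.bin.other", groups)
print(selected)
```

Note that an unescaped dot matches any character, so a stricter value would escape the dots (alt\.bin\.other).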

Updating Index (populating binaries + parts)

The recommended way to schedule updates is via the DOS and Unix start scripts in /path/to/newznab/misc/update_scripts/. Make sure you set the paths correctly for your installation. If you are running in a Unix environment, there are experimental multithreaded scripts named update_binaries_threaded.php and backfill_threaded.php. You can alter the number of threads used by editing the maxChildren setting. For more information on updating and backfilling, see the docs for update scripts.

Categorization

Most categorization of releases is done when the regex is applied. However, if no category is supplied for a regex, then \www\lib\category.php contains the logic which attempts to map a release to a site category. Site categories are used to make browsing NZBs easier. To add a new category, update the category table, add a new Category constant, and then map it in the function determineCategory.

Missing Parts

When headers are requested from the usenet provider, they are asked for in number ranges, e.g. 1-1000, 1001-2000, etc. For various reasons, the provider sometimes does not return a header. This is not always because the header does not exist; there may be some synchronization going on at the provider’s end. If a header is requested but not returned, a record of this is stored in the partrepair table. Each time update_binaries is run, an attempt is made to go back and get the missing parts. If the parts still cannot be obtained after five attempts, Newznab gives up. When update_releases runs, a release that is seen to have missing parts will not be created until four hours after it was uploaded to usenet, which gives the repair process a chance to retrieve the missing parts. After four hours the release is created anyway, and it is down to the quality of the PAR files whether the release can be correctly unpacked.
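
The two rules above (give up after five attempts, hold incomplete releases for four hours) can be sketched as follows. This is an illustrative Python sketch, not Newznab's PHP implementation; the function and constant names are hypothetical.

```python
from datetime import datetime, timedelta

MAX_REPAIR_ATTEMPTS = 5             # from the text: give up after five attempts
RELEASE_DELAY = timedelta(hours=4)  # from the text: hold incomplete releases for four hours


def should_retry_part(attempts):
    """True while a missing part is still worth requesting again."""
    return attempts < MAX_REPAIR_ATTEMPTS


def can_create_release(has_missing_parts, posted_at, now):
    """Complete releases go out at once; incomplete ones wait out the delay."""
    if not has_missing_parts:
        return True
    return now - posted_at >= RELEASE_DELAY


posted = datetime(2011, 5, 15, 12, 0)
print(can_create_release(True, posted, posted + timedelta(hours=3)))
print(can_create_release(True, posted, posted + timedelta(hours=4)))
```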

Backfilling Groups

Since most usenet providers have 800+ days of retention, indexing all that information in one shot is not practical. Newznab provides a backfill feature that allows you to index past articles once your initial index has been built. To use the feature, first set the backfill days setting on the group(s) to be backfilled to the number of days you wish to go back, making sure to set it higher than the number of days listed in the first post column. Once set, run the backfill.php script in misc/update_scripts. Groups can also be backfilled to a particular date with the script misc/update_scripts/backfill_date.php, using the syntax:

php backfill_date.php 2011-05-15 alt.binaries.groupname.here

You can use the _threaded version of this script if you are on Linux.

For more information on backfilling, see Update Scripts.
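
To relate the backfill days setting to the date-based script, the following Python sketch converts a number of days into the date argument and checks the days value against the first post age. The helper names are hypothetical; only the command format comes from the syntax shown above.

```python
from datetime import date, timedelta


def backfill_days_is_valid(backfill_days, first_post_days):
    """The setting must be higher than the age shown in the first post column."""
    return backfill_days > first_post_days


def backfill_target_date(today, backfill_days):
    """Date that lies backfill_days before today."""
    return today - timedelta(days=backfill_days)


def backfill_date_command(target, group):
    """Build the command line in the syntax shown above."""
    return f"php backfill_date.php {target.isoformat()} {group}"


target = backfill_target_date(date(2011, 6, 14), 30)
print(backfill_date_command(target, "alt.binaries.groupname.here"))
# php backfill_date.php 2011-05-15 alt.binaries.groupname.here
```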

Regex Matching

Releases are created by applying regexes to binary message subjects. Different regexes are applied to binaries from different newsgroups. Catchall regexes are applied to any binaries left unmatched after the group specific matching. A category can be associated with a regex, which allows the processing of groups like a.b.inner-sanctum that contain a combination of different binary types.
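
The matching order described above can be sketched like this. It is a simplified Python illustration, not Newznab's PHP code: the patterns, the group name and the category label are all hypothetical, and the real system stores its regexes in the database.

```python
import re

# Hypothetical group specific regexes: group name -> list of (pattern, category).
GROUP_REGEXES = {
    "alt.binaries.example": [(r"^(?P<name>.+?) \[\d+/\d+\]", "hypothetical-category")],
}
# Hypothetical catchall regexes, tried only when no group specific regex matched.
CATCHALL_REGEXES = [(r"^\((?P<name>.+?)\)", None)]


def match_release(group, subject):
    """Try group specific regexes first, then the catchalls."""
    candidates = GROUP_REGEXES.get(group, []) + CATCHALL_REGEXES
    for pattern, category in candidates:
        found = re.search(pattern, subject)
        if found:
            return found.group("name"), category
    return None  # the binary stays unmatched


print(match_release("alt.binaries.example", "Some.Name [01/10]"))
print(match_release("alt.binaries.other", "(Other Name) file"))
```

A category of None here stands for a regex without an associated category, in which case the category would have to be worked out separately.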

Regex Details

Regexes are used to parse the subject header to create definable release names. There are two named capturing groups used for this, ‘name’ and ‘parts’.

The (?P<name>) capturing group is used to define the final release name as well as the text that the binaries are grouped on. It is required to use this named capturing group in the regex.

The (?P<parts>) capturing group defines the total number of parts needed in order to make a release. Most posters include the total number of binaries in the subject header; however, some do not. When the (?P<parts>) capturing group is omitted from the regex, Newznab will wait 4 hours after the postdate of the last binary before making the binaries into a release, to ensure the final release is complete. This capturing group is optional.
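
Python's re module accepts the same (?P<name>...) syntax, so the two groups can be tried out with a short sketch. The pattern and the subject line below are made up for illustration and are not taken from Newznab's regex set; the "current/total" format of the parts group is also an assumption.

```python
import re

# Hypothetical regex: release name, then "[current/total]" in square brackets.
SUBJECT_REGEX = re.compile(r"^(?P<name>.+?)\s+\[(?P<parts>\d+/\d+)\]")


def parse_subject(subject):
    """Extract the release name and part counts from a subject header."""
    found = SUBJECT_REGEX.search(subject)
    if found is None:
        return None
    current, total = (int(n) for n in found.group("parts").split("/"))
    return {"name": found.group("name"), "part": current, "total_parts": total}


subject = 'Example.Release.Name-GRP [03/15] - "example.release.name.part03.rar" yEnc (1/50)'
print(parse_subject(subject))
```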

Regex Updating

Regexes in the system with IDs in the range 0-10000 are system defined; they are maintained centrally and retrieved from Newznab’s server. Every time processReleases is run, a check is performed to see if you have the latest regexes. If you do not want this check to be made, set site.latestregexurl to null.

NZB File Storage

NZBs are saved to disk, in a compressed gzip format, at the location specified by site.nzbpath in subdirectories based on the first character of the release guid; this just makes the directories a bit easier to manage when you have thousands of nzb.gz files. The default path is /website/../nzbfiles.
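
A minimal Python sketch of this storage layout is shown below. It is not Newznab's PHP code, and the file name <guid>.nzb.gz is an assumption; the text only states that files are gzipped and placed in a subdirectory named after the first character of the guid.

```python
import gzip
from pathlib import Path


def nzb_path(nzb_root, guid):
    """Path of the form <nzb_root>/<first char of guid>/<guid>.nzb.gz (file name assumed)."""
    return Path(nzb_root) / guid[0] / f"{guid}.nzb.gz"


def save_nzb(nzb_root, guid, nzb_xml):
    """Write the NZB XML gzip-compressed, creating the subdirectory if needed."""
    target = nzb_path(nzb_root, guid)
    target.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(target, "wt", encoding="utf-8") as handle:
        handle.write(nzb_xml)
    return target
```

With hexadecimal guids this spreads the files over at most sixteen subdirectories.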

Spotnab

Spotnab allows systems to share information with each other, based on discovery and approval of found sources. The implementation is based loosely on how spotweb works. With Spotnab you can fetch comments (and potentially other information) from other Newznab sources and apply them to your own comment section. Fetched content is scanned and decrypted using a password that the Newznab server which posted it chooses to share with you. Since each Newznab server chooses whether or not to share, Spotnab will only populate your database with comments from the sources who choose to share their comments with you.

Additionally, you can post comments: all the comments made by the local users of your site are sent to usenet, encrypted with your own secure private key. Only those with whom you share your public key will be able to decrypt them for their own server.

SSL Usenet Connection

Install the OpenSSL extension, then set the following in config.php:

define ('NNTP_SSLENABLED', true);

Importing & Exporting NZBs

NZB files can be imported from the admin interface (or cli). Importing is a convenient way to fill the index without trawling through a large backlog of usenet messages. After running an import, the processReleases function must be run to post process releases. NZBs can also be exported based on system categories.

The import script lives in /misc/update_scripts/import.php. Usage:

php import.php [path(string)] [usefilename(true/false)] [dupecheck(true/false)] [movefiles(true/false)] [overridecategory(number)]

Admin

Admin functions all live under the URL /admin/, which is only accessible by users with the admin role. Set users.role to 2 on the users you wish to be admins.

TvRage/TVDB

After processReleases is called, an attempt is made to determine the TvRage ID for every release which looks like it’s TV. This also works out the series/episode columns. The TvRage table is populated with best guesses from the TvRage search API. If some of these guesses are wrong, you can manually edit the TvRage data in the admin interface, use the remove link to wipe any releases which have that rageid, and then manually call ‘process tv’, which will attempt to relink rage data. A new release is created with release.rageid = -1. When TV is processed, the rageid is set either to the best guess or to -2, which indicates that either no match could be made or the release isn’t perceived to be TV.
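
The rageid values described above behave like a small state machine, sketched here in Python. Only the values -1 and -2 come from the text; the constant and function names are hypothetical, and the lookup itself is passed in as an already computed best guess.

```python
RAGEID_UNPROCESSED = -1  # from the text: value given to a newly created release
RAGEID_NO_MATCH = -2     # from the text: no match found, or not perceived as TV


def resolve_rageid(current_rageid, best_guess):
    """Return the rageid a release should have after TV processing."""
    if current_rageid != RAGEID_UNPROCESSED:
        return current_rageid  # already processed; leave it alone
    if best_guess is None:
        return RAGEID_NO_MATCH
    return best_guess


print(resolve_rageid(-1, 1234))
print(resolve_rageid(-1, None))
```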

NFO

NFO retrieval uses a queue. A number of attempts are made to get an NFO before giving up.

Caching

Query results can be cached to aid performance, using either memcache or files. In config.php, set the CACHEOPT_METHOD constant to either memcache or file. You can additionally configure the memcache server address and port. The default caching TTL is 15 minutes which, when caching is enabled, is applied to queries in the main browsing lists.
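
The effect of a TTL on cached query results can be illustrated with a small in-memory Python sketch. It stands in for neither the memcache nor the file backend; the class and method names are hypothetical, and only the 15 minute default comes from the text.

```python
import time

DEFAULT_TTL_SECONDS = 15 * 60  # from the text: default TTL of 15 minutes


class QueryCache:
    """Cache query results in memory, keyed by the SQL text."""

    def __init__(self, ttl=DEFAULT_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # sql -> (stored_at, result)

    def get_or_run(self, sql, run_query):
        """Return a fresh cached result, or run the query and cache it."""
        now = self._clock()
        entry = self._entries.get(sql)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        result = run_query(sql)
        self._entries[sql] = (now, result)
        return result
```

Passing the clock in as a parameter makes the expiry behaviour easy to test without waiting 15 minutes.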

IMDb, TMDb and Rotten Tomatoes

If enabled, and if an IMDb ID is found in the NFO, the application will attempt to use that IMDb ID to get general data about the movie (title, year, genre, covers, etc.) from themoviedb.org. If no entry is available from TMDb, an attempt is made to gather the info from imdb.com. Any results are stored in the movieinfo table, with covers/backdrops being saved to the images/covers/ directory.
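
The lookup order can be sketched in Python as follows. This is an illustration rather than Newznab's PHP code: it assumes the NFO contains an imdb.com title URL, and fetch_tmdb and fetch_imdb are hypothetical callables standing in for the real service lookups.

```python
import re

# IMDb title IDs look like "tt0133093"; finding them via a title URL is an assumption.
IMDB_ID_PATTERN = re.compile(r"imdb\.com/title/(tt\d{7,8})", re.IGNORECASE)


def find_imdb_id(nfo_text):
    """Return the first IMDb ID found in the NFO text, or None."""
    found = IMDB_ID_PATTERN.search(nfo_text)
    return found.group(1) if found else None


def lookup_movie(nfo_text, fetch_tmdb, fetch_imdb):
    """Try TMDb first and fall back to IMDb when TMDb has no entry."""
    imdb_id = find_imdb_id(nfo_text)
    if imdb_id is None:
        return None
    return fetch_tmdb(imdb_id) or fetch_imdb(imdb_id)


nfo = "Info: http://www.imdb.com/title/tt0133093/"
print(find_imdb_id(nfo))
```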

3rd Party API Keys

In order to do lookups to TMDb, Rotten Tomatoes and Amazon, API keys are used. Newznab ships with some default keys, but due to the restrictions on use of APIs, it is strongly suggested you go and get your own API keys for each service and save them using the site edit page.

Content/CMS

Pages with SEO friendly URLs can be added to the site via the /admin/ area.

Skinning & Themes

To keep updating painless, avoid making custom edits to the core code and stylesheets.

Override any styles by creating a folder \www\templates\<yourtheme>\. Stick any custom images, views, scripts or styles in \www\templates\<yourtheme>\images\. Then pick the theme in the admin/site-edit page. Your styles and pages will override the existing default pages.

Web API

www.sitename.com/api? provides API access to query and retrieve NZBs. Call www.sitename.com/apihelp to see the help doc with all available options. Users either have to be logged in or provide their rsstoken. Users can use their rsstoken to access both RSS and the API. Full details of the API and how to implement it are provided in the Web API docs.
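
As a sketch of what a client request might look like, the Python snippet below builds a search URL. The parameter names t, q and apikey are not stated in this section; they are assumed from the commonly documented Newznab API, so check /apihelp on your own site before relying on them.

```python
from urllib.parse import urlencode


def build_api_url(base_url, token, query):
    """Build a search URL; the parameter names are assumptions (see /apihelp)."""
    params = {"t": "search", "q": query, "apikey": token}
    return f"{base_url.rstrip('/')}/api?{urlencode(params)}"


print(build_api_url("http://www.sitename.com", "YOUR_RSSTOKEN", "ubuntu iso"))
# http://www.sitename.com/api?t=search&q=ubuntu+iso&apikey=YOUR_RSSTOKEN
```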

Debugging

Switch php.ini error_reporting to E_ALL and ensure logging to browser is enabled.

Development

Here is a brief overview of the location of various Newznab components. For more detailed information, see the appropriate sections in the docs.

\db\schema.sql
The latest database schema. You should be able to rerun it and create a new blank schema.
\db\patch\
Database upgrade patch files. If you update from svn you will need to apply all patches since last update.
\db\cache\
If file based caching is enabled the cache objects are stored here.
\misc\
Used for general docs and useful info; nothing in here is referenced by the application.
\misc\update_scripts\
Shell scripts, batch scripts and PHP files to call the updating of the index from the cli.
\nzbfiles
Default folder for all gzipped NZBs to be stored.
\www\install\
Installer files.
\www\lib\framework
A few general classes for db/http code.
\www\lib\smarty
Copy of a fairly recent Smarty lib, used for template rendering.
\www\lib\
All classes used in the app, each typically named the same as its database entity.
\www\covers\
All covers downloaded for releases.
\www\pages\
Controllers for every frontend page in the system.
\www\admin\
All php pages used by the admin.
\www\templates\default\views\admin
All templates used by the admin pages.
\www\templates\default\views\frontend
All templates used by the user pages.
\www\templates\<yourtheme>
Blank area for implementation specific UI customizations.
\www\templates\default\scripts\
JavaScript dumping ground.
\www\templates\default\styles\
Default theme css (don’t edit, extend with your own theme).

Hall of Fame

(just some of the) people who’ve helped along the way:

iota@cyberarmy regexs, sessions
enstyne@cyberarmy regexs
fatfecker@newznab mediainfo, ffmpeg, tv
gizmore@wechall password, hash
lhbandit@nzbsorg yenc, nntp, bokko, dev
dryes@nzbsorg anidb
pleo@newznab sphinx, mobile, docs
lordgnu@newznab powerspawn, threading
bb@newznab dev
keyvan@newznab backfill
troph@bhw performance
kevin123@newznab compression,theming
wafflehouse@newznab compression,db
jayhawk@nzbsu testing, icons
midgetspy@sickbeard rage, api
dogzipp@dognzb dev
andrew@newznab inno
zdefect@newznab anidb
ueland@newznab installer
ensi@ensisoft api
hecks@tvnzb rar api
michael@newznab dev
l2g@newznab nfo,spotnab
danza@newznab dev
sakarias@newznab testing
pairdime@sabnzbd jquery, css
pmow@sabnzbd headers, backfill
poutine@newznab recaptcha
robv@newznab init.d
bigdave@newznab testing
duz@sabnzbd yenc
inpheaux@sabnzbd design, nzb
spooge@newznab testing
sy@newznab testing, regexs, amazon
magegminds@newznab lighttpd rewrite rules
trizz@newznab lighttpd rewrite rules
emanon@newznab testing
fubaarr@newznab testing
mobiKalw@newznab testing
crudehung@newznab nginx rewrite rules
f0rmed@newznab testing
frikish@github theme
www.famfamfam.com icons
wally73@newznab dev