Overview
========

About
-----

Newznab+ is a PHP/Smarty application that indexes usenet headers into a MySQL
database and provides a simple web-based search interface onto the data.

It includes simple CMS facilities and SEO-friendly URLs, and is designed to
allow users to create a community around their index.

For information on how to install, please refer to INSTALL.txt.

To discuss Newznab, visit us on IRC at ``irc.synirc.net #newznab``, or use the
`web client <http://newznab.com/chat.html>`_.

Newznab+ is licensed under a non-redistributable license.  For details, please
refer to :doc:`LICENCE.txt <licence>`.  An open source (GPLv3) edition of
Newznab is freely available for those wishing to create derivative works.


How it Works
------------

Usenet groups are specified, and message headers (binaries and parts) are
downloaded for the groups which match regex.  Releases are created from
completed sets of binaries by applying :term:`regex` to the message subject,
and are categorised in the same way.  Metadata from :term:`TvRage`,
:term:`TMDb`, :term:`Rotten Tomatoes`, :term:`IMDb` and :term:`Amazon` is then
applied to each created release.


Choosing Newsgroups
-------------------

Groups can be manually entered if you know the name.  Groups can also be bulk
added when specified as a regular expression.  For example if you want to index
the groups ``alt.bin.blah.*`` and ``alt.bin.other`` use the value
``alt.bin.blah.*|alt.bin.other``.
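
As a rough illustration of how such a value selects groups, the following
Python sketch tests the example pattern against a few group names.  It is not
Newznab code (Newznab itself is written in PHP); the group names and the use of
a full-string match are assumptions made for the example.

.. code-block:: python

    import re

    # Example value from the text above.
    pattern = re.compile(r"alt.bin.blah.*|alt.bin.other")

    # Hypothetical group names, used only for illustration.
    groups = [
        "alt.bin.blah",
        "alt.bin.blah.sub",
        "alt.bin.other",
        "alt.bin.unrelated",
    ]

    # Assumption: a group is selected when its whole name matches the pattern.
    selected = [g for g in groups if pattern.fullmatch(g)]
    print(selected)

Note that an unescaped ``.`` matches any character, so a stricter pattern
would escape the dots, for example ``alt\.bin\.other``.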
	

Updating Index (populating binaries + parts)
--------------------------------------------

The recommended way to schedule updates is via the DOS and Unix start scripts
in ``/path/to/newznab/misc/update_scripts/``.  Make sure you set the paths
correctly for your installation.  If you are running in a Unix environment,
there are experimental multithreaded scripts named
``update_binaries_threaded.php`` and ``backfill_threaded.php``.  You can alter
the number of threads used by editing the ``maxChildren`` setting.  For more
information on updating and backfilling, see the docs for
:doc:`update scripts <misc/update_scripts>`.


Categorization
--------------

Most categorization of releases is done at the time of applying the
:term:`regex`.  However, if no category is supplied for a :term:`regex`, then
``\www\lib\category.php`` contains the logic which attempts to map a release to
a site category.  Site categories are used to make browsing NZBs easier.  To
add a new category, update the category table, add a new
:class:`Category` constant, and then map it in the function
:meth:`~Category::determineCategory`.
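
The fallback described above can be pictured with a small Python sketch.  It is
only an illustration of the idea, not the logic in ``category.php``: the
category names, the patterns and the ``determine_category`` helper are all
hypothetical.

.. code-block:: python

    import re

    # Hypothetical fallback rules; the real rules live in category.php.
    FALLBACK_RULES = [
        (re.compile(r"S\d{1,2}E\d{1,2}", re.IGNORECASE), "TV"),
        (re.compile(r"\b(19|20)\d{2}\b.*\b(720p|1080p|DVDRip)\b", re.IGNORECASE), "MOVIE"),
    ]


    def determine_category(release_name, regex_category=None):
        """Prefer the category attached to the regex, else guess from the name."""
        if regex_category is not None:
            return regex_category
        for pattern, category in FALLBACK_RULES:
            if pattern.search(release_name):
                return category
        return "MISC"  # nothing matched


    print(determine_category("Some.Show.S01E02.HDTV"))
    print(determine_category("Some.Film.2010.720p.x264"))
    print(determine_category("Some.Album.FLAC", regex_category="AUDIO"))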

.. _overview-missing_parts:

Missing Parts
-------------

When headers are requested from the usenet provider, they are asked for in
number ranges, e.g. 1-1000, 1001-2000, etc.  For various reasons the provider
sometimes does not return a header.  This is not always because the header
does not exist; there may be some synchronization going on at the provider's
end.  If a header is requested but not returned, a record of this is stored in
the ``partrepair`` table.  Each time
:doc:`update_binaries <misc/update_scripts>` is run, an attempt is made to go
back and get the missing parts.  If the parts still cannot be obtained after
five attempts, Newznab gives up.  When
:doc:`update_releases <misc/update_scripts>` runs, a release that is seen to
have missing parts will not be released until four hours after it was uploaded
to usenet, so that there has been a chance to repair all missing parts.  After
four hours the release will be created anyway, and it is down to the quality of
the :term:`PAR` files to determine whether the release can be correctly
unpacked.
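
The retry behaviour described above can be sketched in Python as follows.  This
is a simplified model rather than Newznab's implementation: the dictionary
stands in for the ``partrepair`` table, and the ``find_missing`` and
``retry_missing`` helpers are hypothetical names.

.. code-block:: python

    MAX_ATTEMPTS = 5  # the text above says Newznab gives up after five attempts


    def find_missing(first, last, received):
        """Return the article numbers in first..last that were not returned."""
        got = set(received)
        return [n for n in range(first, last + 1) if n not in got]


    def retry_missing(part_repair, fetch):
        """Retry every recorded part once.

        part_repair maps article number -> failed attempts so far (a stand-in
        for the partrepair table); fetch(number) returns True on success.
        """
        for number, attempts in list(part_repair.items()):
            if attempts >= MAX_ATTEMPTS:
                continue  # given up on this part
            if fetch(number):
                del part_repair[number]  # repaired, no longer missing
            else:
                part_repair[number] = attempts + 1


    # Headers 1-10 were requested, but 4 and 7 did not come back.
    part_repair = {n: 0 for n in find_missing(1, 10, [1, 2, 3, 5, 6, 8, 9, 10])}

    # Pretend article 4 shows up later while article 7 never does.
    for _ in range(6):  # six update runs
        retry_missing(part_repair, lambda n: n == 4)

    print(part_repair)  # {7: 5}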


Backfilling Groups
------------------

Since most usenet providers have 800+ days of retention, indexing all that
information in one shot is not practical.  Newznab provides a backfill feature
that allows you to index past articles once your initial index has been built.
To use the feature, first set the backfill days setting in the group(s) to be
backfilled to the number of days you wish to go back, making sure to set it
higher than the number of days listed in the first post column.  Once set, run
the ``backfill.php`` script in ``misc/update_scripts``.  Groups can also be
backfilled to a particular date using the script
``misc/update_scripts/backfill_date.php`` with the syntax::

    php backfill_date.php 2011-05-15 alt.binaries.groupname.here

You can use the ``_threaded`` version of this script if you are on Linux.

For more information on backfilling, see :doc:`misc/update_scripts`.
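
If you think in terms of "days back" rather than a calendar date, the date
argument for ``backfill_date.php`` can be worked out with a few lines of
Python.  This is a convenience sketch only; the ``backfill_target`` helper is
hypothetical and not part of Newznab.

.. code-block:: python

    from datetime import date, timedelta


    def backfill_target(days_back, today=None):
        """Return the YYYY-MM-DD date that lies days_back days before today."""
        today = today or date.today()
        return (today - timedelta(days=days_back)).isoformat()


    group = "alt.binaries.groupname.here"
    target = backfill_target(30, today=date(2011, 6, 14))
    print(f"php backfill_date.php {target} {group}")
    # php backfill_date.php 2011-05-15 alt.binaries.groupname.here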


Regex Matching
--------------

Releases are created by applying :term:`regex`\s to binary message subjects.
Different :term:`regex`\s are applied to binaries from different newsgroups.
Catchall :term:`regex`\s are applied to any binaries left unmatched after the
group specific matching.  A category can be associated with a :term:`regex`,
which will allow the processing of groups like ``a.b.inner-sanctum`` which
contain a combination of different binary types.


Regex Details
-------------

Regexes are used to parse the subject header to create definable release names.
There are two named capturing groups used for this, ``name`` and ``parts``.

The ``(?P<name>)`` capturing group is used to define the final release name as
well as the text that the binaries are grouped on.  This named capturing group
is required in the :term:`regex`.

The ``(?P<parts>)`` capturing group defines the total number of parts needed in
order to make a release.  Most posters include the total number of binaries in
the subject header; however, some do not.  When the ``(?P<parts>)`` capturing
group is omitted from the :term:`regex`, Newznab will wait 4 hours after the
post date of the last binary before making the binaries into a release, to
ensure the final release is complete.  This capturing group is optional.
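
To make the two named groups concrete, here is a minimal Python sketch (Python's
``re`` module accepts the same ``(?P<name>...)`` syntax).  The regex and the
subject lines are invented for the example and are not taken from Newznab's
regex set.

.. code-block:: python

    import re
    from collections import defaultdict

    # Hypothetical regex using the two named groups described above.
    subject_re = re.compile(r'^(?P<name>.+?) \[\d+/(?P<parts>\d+)\] - ".+?" yEnc')

    # Hypothetical message subjects.
    subjects = [
        'My.Release.Name [01/03] - "file.part1.rar" yEnc (1/50)',
        'My.Release.Name [02/03] - "file.part2.rar" yEnc (1/50)',
        'My.Release.Name [03/03] - "file.part3.rar" yEnc (1/12)',
        'Other.Release [1/2] - "other.r00" yEnc (1/20)',
    ]

    binaries = defaultdict(list)  # release name -> subjects grouped under it
    expected_parts = {}           # release name -> total number of binaries
    for subject in subjects:
        match = subject_re.match(subject)
        if match is None:
            continue  # left for another regex to handle
        name = match.group("name")
        binaries[name].append(subject)
        expected_parts[name] = int(match.group("parts"))

    # A set of binaries is complete when all expected parts have been seen.
    complete = [n for n in binaries if len(binaries[n]) == expected_parts[n]]
    print(complete)  # ['My.Release.Name']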

	
Regex Updating
--------------

:term:`Regex`\s in the system with IDs in the range 0-10000 are system defined;
they are updated centrally and retrieved from Newznab's server.  Every time
:meth:`~Releases::processReleases` is run, a check is performed to see
if you have the latest :term:`regex`\s.  If you do not want this check to be
made, set ``site.latestregexurl`` to ``null``.
	
	
NZB File Storage
----------------

:term:`NZB`\s are saved to disk, in a compressed gzip format, at the location
specified by ``site.nzbpath`` in subdirectories based on the first character of
the release guid; this just makes the directories a bit easier to manage when
you have thousands of ``nzb.gz`` files.  The default path is
``/website/../nzbfiles``.
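
The storage layout can be sketched in Python as shown below.  The
``<guid>.nzb.gz`` file name and the helper functions are assumptions made for
the illustration; only the gzip compression and the first-character
subdirectory come from the description above.

.. code-block:: python

    import gzip
    import os
    import tempfile


    def nzb_path(nzb_root, guid):
        """Build the path: <nzb_root>/<first character of guid>/<guid>.nzb.gz"""
        return os.path.join(nzb_root, guid[0], guid + ".nzb.gz")


    def save_nzb(nzb_root, guid, xml_text):
        """Write the NZB XML gzip-compressed, creating the subdirectory if needed."""
        path = nzb_path(nzb_root, guid)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with gzip.open(path, "wt", encoding="utf-8") as handle:
            handle.write(xml_text)
        return path


    def load_nzb(nzb_root, guid):
        """Read the gzip-compressed NZB back as text."""
        with gzip.open(nzb_path(nzb_root, guid), "rt", encoding="utf-8") as handle:
            return handle.read()


    with tempfile.TemporaryDirectory() as root:
        guid = "a1b2c3d4e5f6"  # hypothetical release guid
        saved = save_nzb(root, guid, "<nzb></nzb>")
        print(os.path.relpath(saved, root))
        print(load_nzb(root, guid))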

	
Spotnab
-------

Spotnab allows systems to share information with each other based on discovery
and approval of found sources.  The implementation is based loosely on how
Spotweb works.  With Spotnab you can fetch comments (and potentially other
information) from other Newznab sources and apply them to your own comment
section.  Fetched content is scanned and decrypted using a password that the
Newznab server which posted it chooses to share with you.  Since every Newznab
server chooses whether or not to share, Spotnab will only populate your
database with comments from the sources that choose to share their comments
with you.

Additionally, you can post comments: all the comments made by the local users
of your site are sent to usenet, encrypted with your own secure private key.
Only those with whom you share your public key will be able to decrypt them
for their own server.

	
SSL Usenet Connection
---------------------

Install the OpenSSL extension, then set the following in ``config.php``::

    define ('NNTP_SSLENABLED', true);

	
Importing & Exporting NZBs
--------------------------

:term:`NZB` files can be imported from the admin interface (or from the
:doc:`CLI <misc/update_scripts>`).  Importing is a convenient way to fill the
index without trawling through a large backlog of usenet messages.  After
running an import, the :meth:`~Releases::processReleases` function must be
run to post-process the releases.  :term:`NZB`\s can also be exported based on
system categories.

The import script lives in ``/misc/update_scripts/import.php``.  Usage::

    php import.php [path(string)] [usefilename(true/false)] [dupecheck(true/false)] [movefiles(true/false)] [overridecategory(number)]
	
Google Ads/Analytics
--------------------

To integrate Google Analytics and AdSense, enter the AdSense ad module IDs in
the site table for the ``searchbox`` (bottom of menu).  Providing an
Analytics ID will include the Analytics JavaScript in the footer.
	
Admin
-----

Admin functions all live under the URL ``/admin/``, which is only accessible by
users with the admin role.  Set ``users.role`` to 2 for the users you wish to
be admins.


TvRage/TVDB
-----------

After :meth:`~Releases::processReleases` is called, an attempt is made to
determine the :term:`TvRage` ID for every release which looks like it's TV.
This also works out the series/episode columns.  The data in the :term:`TvRage`
table is populated from best guesses from the :term:`TvRage` search API.  If
some of these guesses are wrong, you can manually edit the :term:`TvRage` data
in the admin interface, use the remove link to wipe any releases which have
that ``rageid``, and then manually call 'process tv', which will attempt to
relink the rage data.  A new release is created with
``release.rageid = -1``.  When TV is processed, the ``rageid`` is set either
to the best guess or to -2, which indicates that either no match could be made
or the release isn't perceived to be TV.
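
The meaning of the two special ``rageid`` values can be summarised in a short
Python sketch.  The ``process_tv`` function and the ``lookup`` callable are
hypothetical stand-ins for the TV processing step and the :term:`TvRage`
search; only the values -1 and -2 come from the description above.

.. code-block:: python

    UNPROCESSED = -1  # new release, TV not yet processed
    NO_MATCH = -2     # no match could be made, or not perceived to be TV


    def process_tv(release, lookup):
        """Resolve the rageid of an unprocessed release.

        lookup(name) is a stand-in for the search and returns an ID or None.
        """
        if release["rageid"] != UNPROCESSED:
            return release  # already processed, leave untouched
        best_guess = lookup(release["name"])
        release["rageid"] = best_guess if best_guess is not None else NO_MATCH
        return release


    known_shows = {"Some.Show.S01E02.HDTV": 1234}  # hypothetical lookup data
    tv = process_tv({"name": "Some.Show.S01E02.HDTV", "rageid": UNPROCESSED}, known_shows.get)
    other = process_tv({"name": "Some.Album.2011.MP3", "rageid": UNPROCESSED}, known_shows.get)
    print(tv["rageid"], other["rageid"])  # 1234 -2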


NFO
---

Retrieval of :term:`NFO`\s is attempted using a queuing method.  A number of
attempts are made to get an :term:`NFO` before giving up.

	
Caching
-------

Query results can be cached to aid performance.  Caching is supported using
memcache or files.  In the ``config.php`` file, set the ``CACHEOPT_METHOD``
constant to either ``memcache`` or ``file``.  You can additionally
configure the memcache server address and port.  There is a default caching
TTL of 15 minutes which, when caching is enabled, is applied to queries in the
main browsing lists.
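
The effect of a TTL on repeated queries can be illustrated with a small
in-memory cache in Python.  This is a generic sketch of time-based expiry, not
Newznab's memcache or file cache; the ``TTLCache`` class and the cache key are
hypothetical.

.. code-block:: python

    import time


    class TTLCache:
        """Keep computed values for ttl_seconds, then recompute them."""

        def __init__(self, ttl_seconds=15 * 60, clock=time.monotonic):
            self.ttl = ttl_seconds
            self.clock = clock
            self._store = {}  # key -> (time stored, value)

        def get_or_compute(self, key, compute):
            now = self.clock()
            entry = self._store.get(key)
            if entry is not None and now - entry[0] < self.ttl:
                return entry[1]  # still fresh, skip the query
            value = compute()
            self._store[key] = (now, value)
            return value


    fake_now = [0.0]  # controllable clock so the example needs no waiting
    cache = TTLCache(clock=lambda: fake_now[0])
    calls = []


    def run_query():
        calls.append(1)  # count how often the "database" is hit
        return ["release-a", "release-b"]


    cache.get_or_compute("browse:page1", run_query)  # runs the query
    cache.get_or_compute("browse:page1", run_query)  # served from the cache
    fake_now[0] += 16 * 60                           # 16 minutes later
    cache.get_or_compute("browse:page1", run_query)  # expired, runs again
    print(len(calls))  # 2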
	

IMDb, TMDb and Rotten Tomatoes
------------------------------

If enabled, and if an :term:`IMDb` ID is found in the :term:`NFO`, the
application will attempt to use that :term:`IMDb` ID to get general data about
the movie (title, year, genre, covers, etc.) from themoviedb.org.  If no entry
is available from :term:`TMDb` then an attempt to gather the info from
``imdb.com`` is made.  Any results are stored in the ``movieinfo`` table, with
covers/backdrops being saved to the ``images/covers/`` directory.
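
The lookup order described above can be sketched in Python.  The regex used to
spot an :term:`IMDb` ID, the helper names and the two lookup callables are
assumptions for the illustration; no real web service is contacted.

.. code-block:: python

    import re

    # Assumption: the NFO contains an imdb.com title URL such as .../title/tt1234567/
    IMDB_ID_RE = re.compile(r"imdb\.com/title/tt(\d{7,8})", re.IGNORECASE)


    def find_imdb_id(nfo_text):
        """Return the numeric part of the first IMDb ID in the NFO, or None."""
        match = IMDB_ID_RE.search(nfo_text)
        return match.group(1) if match else None


    def fetch_movie_info(imdb_id, tmdb_lookup, imdb_lookup):
        """Try the TMDb lookup first and fall back to the IMDb lookup."""
        info = tmdb_lookup(imdb_id)
        if info is None:
            info = imdb_lookup(imdb_id)
        return info


    nfo = "Some release notes\nhttp://www.imdb.com/title/tt1234567/\n"
    imdb_id = find_imdb_id(nfo)

    # Stand-in lookups: pretend TMDb has no entry and IMDb does.
    info = fetch_movie_info(
        imdb_id,
        tmdb_lookup=lambda _id: None,
        imdb_lookup=lambda _id: {"title": "Example Movie", "year": 2011},
    )
    print(imdb_id, info)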


3rd Party API Keys
------------------

In order to do lookups to :term:`TMDb`, :term:`Rotten Tomatoes` and
:term:`Amazon`, API keys are used.  Newznab ships with some default keys, but
due to the restrictions on use of APIs, it is strongly suggested you go and get
your own API keys for each service and save them using the site edit page.


Content/CMS
-----------

Pages can be added to the site with SEO-friendly URLs via the ``/admin/``
interface.


Skinning & Themes
-----------------

To keep updates painless, avoid custom edits to the shipped code and
stylesheets.

Override any styles by creating a folder ``\www\templates\<yourtheme>\``.
Put any custom images, views, scripts or styles in
``\www\templates\<yourtheme>\images\``.  Then pick the theme in the
``admin/site-edit`` page.  Your styles and pages will override the existing
default pages.
	
Web API
-------

``www.sitename.com/api?`` provides API access to query and retrieve NZBs.  Call
``www.sitename.com/apihelp`` to see the help document with all available
options.  Users either have to be logged in or provide their ``rsstoken``.
Users can use their ``rsstoken`` to access both RSS and the API.  Full details
of the API and how to implement it are provided in the :doc:`misc/api` docs.
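
As a hedged illustration, a search URL could be assembled in Python like this.
The parameter names ``t``, ``q`` and ``apikey`` are assumptions here (with the
``rsstoken`` passed as ``apikey``); check the ``/apihelp`` page of your own
installation for the options it actually accepts.  The ``build_search_url``
helper is hypothetical.

.. code-block:: python

    from urllib.parse import urlencode


    def build_search_url(base_url, token, query):
        """Build an API search URL; parameter names are assumed, see /apihelp."""
        params = {"t": "search", "q": query, "apikey": token}
        return base_url.rstrip("/") + "/api?" + urlencode(params)


    url = build_search_url("http://www.sitename.com", "YOUR_RSSTOKEN", "linux iso")
    print(url)
    # http://www.sitename.com/api?t=search&q=linux+iso&apikey=YOUR_RSSTOKEN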


Debugging
---------

Switch ``php.ini`` ``error_reporting`` to ``E_ALL`` and ensure logging to
browser is enabled.


Development
-----------

Here is a brief overview of the location of various Newznab components.  For
more detailed information, see the appropriate sections in the docs.

``\db\schema.sql``
    The latest database schema.  You should be able to rerun it and create a
    new blank schema.

``\db\patch\``
    Database upgrade patch files.  If you update from svn you will need to
    apply all patches since your last update.
    
``\db\cache\``
    If file based caching is enabled the cache objects are stored here.
    
``\misc\``
    Used for general docs and useful info; nothing in here is referenced by the
    application.
    
``\misc\update_scripts\``
    Shell scripts, batch scripts and PHP files used to update the index from
    the CLI.
    
``\nzbfiles``
    Default folder for all gzipped NZBs to be stored.

``\www\install\``
    Installer files.

``\www\lib\framework``
    A few general classes for db/http code.

``\www\lib\smarty``
    Copy of a fairly recent Smarty lib, used for template rendering.
    
``\www\lib\``
    All classes used in the app, each typically named after its database
    entity.
    
``\www\covers\``
    All covers downloaded for releases.

``\www\pages\``
    Controllers for every frontend page in the system.

``\www\admin\``
    All php pages used by the admin.

``\www\templates\default\views\admin``
    All templates used by the admin pages.
    
``\www\templates\default\views\frontend``
    All templates used by the user pages.
    
``\www\templates\<yourtheme>``
    Blank area for implementation specific UI customizations.
    
``\www\templates\default\scripts\``
    JavaScript dumping ground.

``\www\templates\default\styles\``
    Default theme css (don't edit, extend with your own theme).


Hall of Fame
------------

(just some of the) people who've helped along the way:

=========================   =============================
iota@cyberarmy              regexs, sessions
enstyne@cyberarmy           regexs
fatfecker@newznab           mediainfo, ffmpeg, tv
gizmore@wechall             password, hash
lhbandit@nzbsorg            yenc, nntp, bokko, dev
dryes@nzbsorg               anidb
pleo@newznab                sphinx, mobile, docs
lordgnu@newznab             powerspawn, threading
bb@newznab                  dev
keyvan@newznab              backfill
troph@bhw                   performance
kevin123@newznab            compression, theming
wafflehouse@newznab         compression, db
jayhawk@nzbsu               testing, icons
midgetspy@sickbeard         rage, api
dogzipp@dognzb              dev
andrew@newznab              inno
zdefect@newznab             anidb
ueland@newznab              installer
ensi@ensisoft               api
hecks@tvnzb                 rar api
michael@newznab             dev
l2g@newznab                 nfo, spotnab
danza@newznab               dev
sakarias@newznab            testing
pairdime@sabnzbd            jquery, css
pmow@sabnzbd                headers, backfill
poutine@newznab             recaptcha
robv@newznab                init.d
bigdave@newznab             testing
duz@sabnzbd                 yenc
inpheaux@sabnzbd            design, nzb
spooge@newznab              testing
sy@newznab                  testing, regexs, amazon
magegminds@newznab          lighttpd rewrite rules
trizz@newznab               lighttpd rewrite rules
emanon@newznab              testing
fubaarr@newznab             testing
mobiKalw@newznab            testing
crudehung@newznab           nginx rewrite rules
f0rmed@newznab              testing
frikish@github              theme
www.famfamfam.com           icons
wally73@newznab             dev
=========================   =============================