GitHub - nubzzz/newznab: Mirror of newznab SVN

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 1,431 Commits
db		db
misc		misc
www		www
CHANGELOG.txt		CHANGELOG.txt
INSTALL.txt		INSTALL.txt
LICENSE.txt		LICENSE.txt
README.txt		README.txt

Repository files navigation

ABOUT
Newznab is a PHP/Smarty application, which supports the indexing of usenet headers
into a mysql database and provides a simple web based search interface onto the data.

It includes simple CMS facilities, SEO friendly URLs and is designed with the intention
of allowing users to create a community around their index.

For information on how to install, please refer to INSTALL.txt

To discuss use irc.synirc.net #newznab

Newznab is licensed under terms of the GNU General Public License. For details, please
refer to LICENSE.txt.

HOW IT WORKS
usenet groups are specified, message headers (binaries and parts) are downloaded for the
groups, releases are created from completed sets of binaries by applying regex to the message subject.
releases are categorised by regexing the message subject. metadata from tvrage, tmdb, imdb and amazon
are applied toeach created release. after a configurable number of days the header data is deleted
from the database, but the releases remain.

CHOOSING NEWSGROUPS
groups can be manually entered if you know the name. groups can also be bulk added when
specified as a regular expression. for example if you want to index the groups alt.bin.blah.*
and alt.bin.other use the value 'alt.bin.blah.*|alt.bin.other'.

UPDATING INDEX (populating binaries + parts)
the recommended way to schedule updates is via the dos and unix start scripts in
/path/to/newznab/misc/update_scripts/. make sure you set the paths correctly to your installation.

CATEGORISATION
most categorisation of releases is done at the time of applying the regex. however if no category
is supplied for a regex then \www\lib\category.php contains the logic which attempts to map a
release to a site category. site categories are used to make browsing nzbs easier. add new categories
by updating the category table, and adding a new Category::constant. Then map it in the
function determineCategory()

MISSING PARTS
when headers are requested from the usenet provider, they are asked for in number ranges
e.g. 1-1000, 1001-2000 etc. for various reasons sometimes the provider does not return
a header, this is not always because the header does not exist, there may be some synchronisation
going on at the providers end. if a header is requested but not returned, we store a record of this
in the table partrepair. each time update_binaries is ran an attempt is made to go back and get the
missing parts. if after five attempts the parts can still not be obtained, newznab gives up.
when update_releases runs, if a release is seen to have missing parts it will not be released until
four hours after it was uploaded to usenet. this is so a chance has been made to repair all missing
parts. after four hours a release will be created anyway and its down to the quality of the par files
to determine whether a release can be correctly unpacked.

BACKFILLING GROUPS
since most usenet providers have 800+ days of retention indexing all that information in one shot
is not practical. newznab provides a backfill feature that allow you to index past articles once
your initial index has been built. to use the feature first set the back fill days setting in the group(s)
to be backfilled to the number of day you wish to go back, making sure to set it higher than the number
of days listed in the first post column. once set run the backfill.php script in misc/update_scripts.
groups can be backfilled to a particular date using the script misc/update_scripts/backfill_date.php
using the syntax php backfill_date.php 2011-05-15 alt.binaries.groupname

REGEX MATCHING
releases are created by applying regexs to binary message subjects. different regexes
are applied to binaries from different newsgroups. catchall regexes are applied to any
binaries left unmatched after the group specific matching. a category can be associated
with a regex, which will allow the processing of groups like inner-sanctum which contain a
combination of different binary types.

REGEX UPDATING
regexes in the system in the range 0-10000 are system defined and are updated centrally.
everytime processreleases is ran, a check will be performed to see if you have the latest regexs.
if you do not want this check to be made then set site.latestregexurl to null

NZB FILE STORAGE
nzbs are saved to disk gzipped at the location specified by site.nzbpath in subdirs based on the
first char of the release guid, this just makes the dirs a bit easier to manage when you have thousands
of nzb.gz files. the default path is /website/../nzbfiles

SSL USENET CONNECTION
Install the OpenSSL extension, set config.php define ('NNTP_SSLENABLED', true);

IMPORTING/EXPORTING NZBS
.nzb files can be imported from the admin interface (or cli). importing is a convenient way to fill the
index without trawling a large backdated number of usenet messages. after running an import
the processReleases() function must be run to create valid releases. nzbs can also be exported
based on system categories.

GOOGLE ADS/ANALYTICS
to integrate google analytics and adsense provide enter the adsense ad module ids
into the site table for the sidepanel (right), searchbox (bottom of menu).
providing an analytics id will include the analytics js in the footer

ADMIN
admin functions all live in /admin/ which is only accessible by users with admin role.
set users.role to be 2 on the users you wish to be admins

TVRAGE
after releases.processReleases() is called, an attempt is made to determine the tvrageids for every
release which looks like its tv. this also works out the series/episode columns. the data in the
tvrage table will become populated from best guesses from the rage search api. if some of these
guesses are wrong, you can manually edit the rage data in the admin interface, and use the
remove link to wipe any releases which have that rageid and then manually call 'process tv' which
will attempt to relink rage data. when a new release is created it goes in with a release.rageid of -1
when tv is processed, the rageid either goes to the best guess, or to -2, which indicates no match
could be made, or the release isnt percieved to be tv.

TVRAGE - SERIES/EPISODE INFO
information about a particular episode of a series is requested from services.tvrage.com a shared
newznab api key is used to retrieve this data. it assigns an 'aired date' to each release if a match
is found.

NFO
nfos are attempted to be retrieved using a queuing method. there will be a number of attempts to get
an nfo before giving up

IMDB/TMDB
if enabled, and if an imdb id is found in the nfo, the application will attempt to use that imdb id to
get general data about the movie (title, year, genre, covers etc) from themoviedb.org. If no entry is
available from tmdb then an attempt to gather the info from imdb.com is made. any results are stored
in the moveinfo table, with covers/backdrops being saved to the images/covers/.

3RD PARTY API KEYS
in order to do lookups to tmdb and amazon, api keys are used. newznab is shipped with some default
keys, but due to the restrictions on use of api's, it is strongly suggested you go and get your own
api keys and save them in the site edit page.

CONTENT/CMS
pages can be added to the site with seo friendly urls via the /admin/ edit content links

SKINNING/THEMES
avoid custom edits to code and stylesheets to make updating painless.
override any styles by creating a folder \www\theme\<yourtheme>\style.css. stick any custom images in
\www\theme\<yourtheme>\images\
pick the theme in the admin\site-edit page. your styles should override the existing style sheet, which
should always be loaded in.

API
www.sitename.com/api? provides api access to query and retrieve nzbs.
call www.sitename.com/apihelp to see help doc with all available options. users either have to be
logged in or provide their rsstoken. users can use their rsstoken to access both rss + api.
full details of the api and how to implement it are provided in /misc/api/

FAQS
* authorisation rejected from nntp server - check you have enough available connections
not currently in use
* white screen - probably php error not being displayed to browser or session timed out and 403 being throw
* Lots of binary headers processed but few releases created - The binary headers downloaded do not match
the regexes used to create a release. The message headers must follow popular formats in order for releases
to be created.
* search and rawsearch requests lose page css styling - use the provided apache vhost settings.
* Server did not return article numbers 1234567 - this isnt necessarily a bad thing, see section on missing parts
* connection timed out. Reconnecting... Cannot connect to server *******: Already connected, disconnect first! -
disable compressed headers
* session error during install step1 - set register_globals to off.
* Warning: Wrong parameter count for strstr() in newznab\www\lib\TMDb.php on line 354 - wrong php version, requires 5.3+
* Strict Standards: Non-static method PEAR::isError() should not be called statically - disable strict error reporting
* Error 502 Bad Gateway - error at $cfg->pearCheck = @include('System.php'); solved by adding in open_basedir path to pear
* Call to undefined function curl_init() - Make sure you are using the right php.ini file. If you are using WAMP, then
the php.ini file that apache uses is in the apache /bin folder (not the php.ini in wamp/php). The php cli will use
the first php.ini it can find in the windows path environment variable. In my case, this was an old version in
another php directory I set up. Once I deleted that, it used the version in the /wamp/php directory.

DEBUGGING
switch php.ini error_reporting to E_ALL and ensure logging to browser is enabled.

DEVELOPMENT
\db\schema.sql is latest database schema. should be able to rerun in and create new blank schema
\misc\ for general docs and useful info, nothing in here is referenced by the application
\misc\update_scripts\ shell and batch scripts and php files to call the updating of index from cli
\nzbfiles default folder for all gzipped nzbs to be stored
\www\install installer files
\www\lib\framework few general classes for db/http code
\www\lib\smarty copy of a fairly recent smarty lib
\www\lib\ all classes used in the app, typically named same as its database entity
\www\covers\ all covers downloaded for releases
\www\views\templates\admin all templates used by the admin pages
\www\views\templates\frontend all templates used by the user pages
\www\pages\ controllers for every frontend page in the system
\www\admin\ all php pages used by the admin
\www\theme\<yourtheme> blank area for implementation specific ui customisations
\www\views\scripts\ js dumping ground

HALL OF FAME
(just some of the) people who've helped along the way.
iota@irc.cyberarmy.net regexs,sessions
enstyne@irc.cyberarmy.net regexs
fatfecker@newznab mediainfo ffmpeg tv
gizmore@wechall.net password,hash
lhbandit@nzbs.org yenc,nntp,bokko,dev
dryes@nzbs.org anidb
bb@newznab dev
keyvan@newznab backfill
jayhawk@nzb.su testing,icons
midgetspy@sickbeard rage integration,api
ueland@newznab installer
ensi@ensisoft.com api
hecks@tvnzb rar api
michael@newznab dev
sakarias@newznab testing
pairdime@sabnzbd jquery,css
pmow@sabnzbd headers,backfill
bigdave@newznab testing
duz@sabnzbd yenc
inpheaux@sabnzbd design,nzb
spooge@newznab testing
sy@newznab testing, regexs, amazon
magegminds@newznab lighttpd rewrite rules
trizz@newznab lighttpd rewrite rules
fubaarr@newznab testing
mobiKalw@newznab testing
crudehung@newznab nginx rewrite rules
www.famfamfam.com icons