<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Nadav Samet&#039;s Blog &#187; howto</title>
	<atom:link href="http://www.thesamet.com/blog/tags/howto/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.thesamet.com</link>
	<description>Because everyone needs a blog</description>
	<lastBuildDate>Sat, 07 Aug 2010 21:58:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>How to Gain Root Access on SunOS (a 1-day exploit)</title>
		<link>http://www.thesamet.com/blog/2007/02/12/how-to-gain-root-access-on-sunos-a-1-day-exploit/</link>
		<comments>http://www.thesamet.com/blog/2007/02/12/how-to-gain-root-access-on-sunos-a-1-day-exploit/#comments</comments>
		<pubDate>Mon, 12 Feb 2007 21:54:12 +0000</pubDate>
		<dc:creator>thesamet</dc:creator>
				<category><![CDATA[howto]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[unix]]></category>

		<guid isPermaLink="false">http://www.thesamet.com/blog/2007/02/12/how-to-get-root-access-on-sunos-a-1-day-exploit/</guid>
		<description><![CDATA[If you happen to find a Sun Solaris server with a telnet daemon running, it is very likely that you can get superuser access on it by just typing: $ telnet -l "-froot" server where server is the server name. &#8230; <a href="http://www.thesamet.com/blog/2007/02/12/how-to-gain-root-access-on-sunos-a-1-day-exploit/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>If you happen to find a Sun Solaris server with a telnet daemon running, it is very likely that you can get superuser access on it by just typing:</p>
<pre>
$ telnet -l "-froot" server</pre>
<p>where <code>server</code> is the server name. I was able to confirm this on a Solaris server nearby.</p>
<p>It&#8217;s amazing to see that this one was overlooked for SO much time, and how using this exploit does not require any skill whatsoever. If root logins through telnet are disabled, you may still be able to login as any other user (think sysadmin&#8217;s user account + keystroke recorder)</p>
<p>While the telnet port is usually blocked for servers on the internet, it is quite commonly left open inside local networks, especially in universities. So go ahead and look for Solaris rootkits: the exam period is just around the corner <img src='http://www.thesamet.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Source: <a href="http://erratasec.blogspot.com/2007/02/trivial-remote-solaris-0day-disable.html">Errata Security blog</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thesamet.com/blog/2007/02/12/how-to-gain-root-access-on-sunos-a-1-day-exploit/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pumping Up Your Applications with Xapian Full-Text Search</title>
		<link>http://www.thesamet.com/blog/2007/02/04/pumping-up-your-applications-with-xapian-full-text-search/</link>
		<comments>http://www.thesamet.com/blog/2007/02/04/pumping-up-your-applications-with-xapian-full-text-search/#comments</comments>
		<pubDate>Sun, 04 Feb 2007 06:05:41 +0000</pubDate>
		<dc:creator>thesamet</dc:creator>
				<category><![CDATA[howto]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[turbogears]]></category>

		<guid isPermaLink="false">http://www.thesamet.com/blog/2007/02/04/pumping-up-your-applications-with-xapian-full-text-search/</guid>
		<description><![CDATA[What good is an application, no matter how much information it contains, if the inability to easily search it renders it useless? Xapian to the Rescue Xapian is an excellent open source (GPL) search engine library. It is written in C++ and &#8230; <a href="http://www.thesamet.com/blog/2007/02/04/pumping-up-your-applications-with-xapian-full-text-search/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.thesamet.com/blog/wp-content/uploads/2007/02/needle_small.jpg" title="Needle Small" alt="Needle Small" align="right" />What good is an application, no matter how much information it contains, if the inability to easily search it renders it useless?</p>
<h2>Xapian to the Rescue</h2>
<p><a href="http://www.xapian.org/">Xapian</a> is an excellent open source (GPL) search engine library. It is written in C++ and comes with bindings for Python as well as many other languages, and it supports everything you&#8217;d expect from a modern search engine:</p>
<ul>
<li><strong>Ranked probabilistic search</strong> – The results that are returned are ranked according to their relevancy, with the most relevant occurring first.</li>
<li><strong>Boolean search operators</strong> – You can use AND, OR, NOT, XOR in your searches.</li>
<li><strong>Phrase and proximity searching</strong> – For example, &#8220;used books&#8221; will look for occurrences of these words as an exact phrase, but you can also search for &#8220;used NEAR books&#8221; to find occurrences of the words &#8220;used&#8221; and &#8220;books&#8221; that are within 10 words of each other. You can even write &#8220;used NEAR/3 books&#8221; to change the proximity threshold to three words.</li>
<li><strong>Stemming of search terms</strong> – If, for example, you search for &#8220;programmer,&#8221; you can find articles that mention &#8220;programmers&#8221; or &#8220;programming.&#8221; Xapian currently supports stemming in Danish, Dutch, English, Finnish, French, German, Italian, Norwegian, Portuguese, Russian, Spanish, and Swedish.</li>
<li><strong>Stopwords</strong> – Xapian already knows which common words to ignore (like &#8216;are,&#8217; &#8216;is,&#8217; and &#8216;being&#8217;).</li>
<li><strong>Simultaneous update and searching</strong> – Xapian lets you index new articles while the database is being searched. New articles become searchable once the search process reopens the database.</li>
<li><strong>Relevant Suggestions</strong> – Xapian can automatically suggest documents that are relevant to a given document. As such, you can add a list of &#8220;similar items&#8221; to each page.</li>
<li><strong>Value-associated results</strong> – You can associate values like word-count, date, page views, diggs, and so on with each document. Xapian can return results that are sorted by any of these criteria.</li>
<li><strong>Document metadata</strong> – You can add metadata tags to each document (Xapian calls these terms). These tags can be anything you desire, like author, title, date, tags, and so on. Users can then search within the metadata by typing &#8220;author:john&#8221;.</li>
</ul>
<p style="text-align: center"><a href="http://www.thesamet.com/blog/wp-content/uploads/2007/02/xapian.png" title="Xapian diagram medium"><img src="http://www.thesamet.com/blog/wp-content/uploads/2007/02/xapian_med.png" alt="Xapian diagram medium" border="0" /></a></p>
<p>The above diagram shows the main participants in a typical search-enabled use case. We assume that the data to be searched is stored in a relational database (the blue SQL server jar), but it doesn’t really matter where the data comes from. The indexer is a Python program that is executed periodically (as a cron job). Its function is to retrieve new or changed documents from the database and to index them. The Xapian library handles the actual read/write operations on the Xapian database (in the purple jar).</p>
<p>Since the Xapian library is not thread-safe and because Web applications are usually multithreaded, you need to implement a locking mechanism if you want access to a Xapian database to be safe. My preferred way to accomplish this is to use a separate process (the orange Search Server box). This process will be a single-threaded RPC server that will handle all searches. The benefit of this strategy is that you can move the search server process (together with the Xapian database) to a different machine. In so doing, you free up a lot of resources on the server that runs your application, which makes the system very scalable. In general, you can expect any bottlenecks to be more IO and memory related than CPU related.</p>
<p>Alternatively, your search operations can be directly initiated from your application process. This alternative will work as long as you use a mutex to govern access to your database. However, I wouldn&#8217;t recommend doing this in a production environment. Why? Because you&#8217;ll never be sure what’s consuming so much memory—the Xapian library or your application.</p>
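<p>To make the mutex alternative concrete, here is a minimal sketch of the idea. It is written for Python 3 (unlike the Python 2 listings in this post), and the names <code>LockedSearcher</code> and <code>fake_search</code> are made up for illustration; <code>fake_search</code> merely stands in for whatever function performs the real Xapian lookup in your application.</p>

```python
import threading


class LockedSearcher:
    """Serialize calls to a search function that is not thread-safe."""

    def __init__(self, search_func):
        # search_func is a hypothetical callable taking (query, offset, count).
        self._search = search_func
        self._lock = threading.Lock()

    def search(self, query, offset, count):
        # Only one thread at a time may run the wrapped search.
        with self._lock:
            return self._search(query, offset, count)


def fake_search(query, offset, count):
    # Stand-in for a real Xapian lookup, used only to keep the sketch runnable.
    return {"count": 1, "results": [{"percent": 100, "article_id": "1"}]}


searcher = LockedSearcher(fake_search)
print(searcher.search("python", 0, 10)["count"])
```

<p>Every request thread would share the same <code>searcher</code> object, so searches queue up behind the lock. That is simple, but it is also why I prefer the separate search server described above.</p>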
<p>The application (red box) gets a very clean search API. It simply connects to the XML RPC server (one line of code) and obtains access to a search() method, which takes the search query, a zero-based offset, and the number of results needed as arguments. It then returns a dictionary with the total number of available results and the results themselves.</p>
<p>In this tutorial, we&#8217;ll create a searchable article database. We assume that you already have Xapian and the Xapian bindings installed. We&#8217;ll start with the indexer.</p>
<h2>The Indexer: The Golden Retriever</h2>
<p><img src="http://www.thesamet.com/blog/wp-content/uploads/2007/02/indexer2.jpg" title="Golden Document Retriever" alt="Golden Document Retriever" align="right" />Following is the indexer code. It is tailored to TurboGears and SQLAlchemy, but it can be easily adapted to suit any ORM. It accepts three command line arguments: the configuration file, which helps it find the database (either dev.cfg or prod.cfg); the Xapian database location, which is simply a directory name; and a number of hours (such that all documents that were created or modified within this number of hours will be indexed). If you run the indexer every hour, you can safely give 2 as the third argument. If you&#8217;d like to re-index all articles, pass in 0 as the third argument.</p>
<div class="dean_ch" style="white-space: wrap;">
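<p><span class="co1">#!/usr/bin/env python</span><br />
<span class="kw1">import</span> <span class="kw3">re</span><br />
<span class="kw1">import</span> <span class="kw3">sys</span><br />
<span class="kw1">from</span> <span class="kw3">datetime</span> <span class="kw1">import</span> *<br />
<span class="kw1">import</span> turbogears<br />
<span class="kw1">import</span> xapian</p>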
<p>WORD_RE = <span class="kw3">re</span>.<span class="kw2">compile</span><span class="br0">&#40;</span>r<span class="st0">&quot;\w{1,32}&quot;</span>, <span class="kw3">re</span>.<span class="me1">U</span><span class="br0">&#41;</span><br />
ARTICLE_ID = <span class="nu0">0</span></p>
<p>stemmer = xapian.<span class="me1">Stem</span><span class="br0">&#40;</span><span class="st0">&quot;en&quot;</span><span class="br0">&#41;</span> <span class="co1"># english stemmer</span></p>
<p><span class="kw1">def</span> create_document<span class="br0">&#40;</span>article<span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; <span class="st0">&quot;&quot;</span><span class="st0">&quot;Gets an article object and returns<br />
&nbsp; &nbsp; a Xapian document representing it and<br />
&nbsp; &nbsp; a unique article identifier.&quot;</span><span class="st0">&quot;&quot;</span></p>
<p>&nbsp; &nbsp; doc = xapian.<span class="me1">Document</span><span class="br0">&#40;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; text = article.<span class="me1">text</span>.<span class="me1">encode</span><span class="br0">&#40;</span><span class="st0">&quot;utf8&quot;</span><span class="br0">&#41;</span></p>
<p>&nbsp; &nbsp; <span class="co1"># go word by word, stem it and add to the</span><br />
&nbsp; &nbsp; <span class="co1"># document.</span><br />
&nbsp; &nbsp; <span class="kw1">for</span> index, term <span class="kw1">in</span> <span class="kw2">enumerate</span><span class="br0">&#40;</span>WORD_RE.<span class="me1">finditer</span><span class="br0">&#40;</span>text<span class="br0">&#41;</span><span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; doc.<span class="me1">add_posting</span><span class="br0">&#40;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; stemmer.<span class="me1">stem_word</span><span class="br0">&#40;</span>term.<span class="me1">group</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; index<span class="br0">&#41;</span><br />
&nbsp; &nbsp; doc.<span class="me1">add_term</span><span class="br0">&#40;</span><span class="st0">&quot;A&quot;</span>+article.<span class="me1">submitted_by</span>.<span class="me1">user_name</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; doc.<span class="me1">add_term</span><span class="br0">&#40;</span><span class="st0">&quot;S&quot;</span>+article.<span class="me1">subject</span>.<span class="me1">encode</span><span class="br0">&#40;</span><span class="st0">&quot;utf8&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; article_id_term = <span class="st0">&#39;I&#39;</span>+<span class="kw2">str</span><span class="br0">&#40;</span>article.<span class="me1">article_id</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; doc.<span class="me1">add_term</span><span class="br0">&#40;</span>article_id_term<span class="br0">&#41;</span><br />
&nbsp; &nbsp; doc.<span class="me1">add_value</span><span class="br0">&#40;</span>ARTICLE_ID, <span class="kw2">str</span><span class="br0">&#40;</span>article.<span class="me1">article_id</span><span class="br0">&#41;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; <span class="kw1">return</span> doc, article_id_term</p>
<p><span class="kw1">def</span> index<span class="br0">&#40;</span>db, period<span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; <span class="st0">&quot;&quot;</span><span class="st0">&quot;Index all articles that were modified<br />
&nbsp; &nbsp; in the last &lt;period&gt; hours into the given<br />
&nbsp; &nbsp; Xapian database&quot;</span><span class="st0">&quot;&quot;</span></p>
<p>&nbsp; &nbsp; <span class="kw1">if</span> period:<br />
&nbsp; &nbsp; &nbsp; &nbsp; start = <span class="kw3">datetime</span>.<span class="me1">now</span><span class="br0">&#40;</span><span class="br0">&#41;</span>-timedelta<span class="br0">&#40;</span>hours=period<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; query = <span class="br0">&#40;</span>Article.<span class="me1">c</span>.<span class="me1">last_modified</span>&gt;start<span class="br0">&#41;</span><br />
&nbsp; &nbsp; <span class="kw1">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; query = <span class="kw2">None</span><br />
&nbsp; &nbsp; articles = session.<span class="me1">query</span><span class="br0">&#40;</span>Article<span class="br0">&#41;</span>.<span class="kw3">select</span><span class="br0">&#40;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;query<span class="br0">&#41;</span></p>
<p>&nbsp; &nbsp; <span class="kw1">for</span> article <span class="kw1">in</span> articles:<br />
&nbsp; &nbsp; &nbsp; &nbsp; doc, id_term = create_document<span class="br0">&#40;</span>article<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; db.<span class="me1">replace_document</span><span class="br0">&#40;</span>id_term, doc<span class="br0">&#41;</span></p>
<p><span class="kw1">if</span> __name__==<span class="st0">&quot;__main__&quot;</span>:<br />
&nbsp; &nbsp; configfile, dbpath, period = <span class="kw3">sys</span>.<span class="me1">argv</span><span class="br0">&#91;</span><span class="nu0">1</span>:<span class="br0">&#93;</span><br />
&nbsp; &nbsp; turbogears.<span class="me1">update_config</span><span class="br0">&#40;</span>configfile=configfile,<br />
&nbsp; &nbsp; &nbsp; &nbsp; modulename=<span class="st0">&quot;myproject.config&quot;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; <span class="kw1">from</span> myproject.<span class="me1">model</span> <span class="kw1">import</span> *<br />
&nbsp; &nbsp; turbogears.<span class="me1">database</span>.<span class="me1">bind_meta_data</span><span class="br0">&#40;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; db = xapian.<span class="me1">WritableDatabase</span><span class="br0">&#40;</span>dbpath,<br />
&nbsp; &nbsp; &nbsp; &nbsp; xapian.<span class="me1">DB_CREATE_OR_OPEN</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; index<span class="br0">&#40;</span>db, <span class="kw2">int</span><span class="br0">&#40;</span>period<span class="br0">&#41;</span><span class="br0">&#41;</span></div>
<p>All strings that are passed to Xapian functions must be encoded in UTF-8. The create_document() function splits the article’s text into words, stems them, and then adds them one by one to the Xapian document. Next, two terms are added: the username of the article’s author, prefixed by the letter &#8216;A,&#8217; and the article’s subject, prefixed by the letter &#8216;S.&#8217; The first character of each of these terms acts as a prefix that gives the term its meaning. You&#8217;ll see how we use these terms when we code the search server.</p>
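<p>If the mix of positional postings and prefixed terms feels abstract, the following rough sketch mimics it without Xapian at all. It is written for Python 3, and <code>toy_stem()</code> and <code>build_terms()</code> are names I made up for illustration; the toy stemmer is a crude stand-in for a real stemmer, not how Xapian stems words.</p>

```python
import re

WORD_RE = re.compile(r"\w{1,32}")


def toy_stem(word):
    # Crude stand-in for a real stemmer: lowercase and drop a trailing "s".
    word = word.lower()
    return word[:-1] if word.endswith("s") else word


def build_terms(text, author, subject):
    """Return (postings, metadata_terms) for one article."""
    # Each posting pairs a stemmed word with its position in the text.
    postings = [
        (toy_stem(match.group()), position)
        for position, match in enumerate(WORD_RE.finditer(text))
    ]
    # Metadata terms carry a one-letter prefix that says what they mean.
    metadata_terms = ["A" + author, "S" + subject]
    return postings, metadata_terms


postings, metadata_terms = build_terms("Searching books", "john", "Xapian")
print(postings)        # [('searching', 0), ('book', 1)]
print(metadata_terms)  # ['Ajohn', 'SXapian']
```

<p>The positions are what make phrase and NEAR searches possible, while the prefixed terms are what a query like &#8220;author:john&#8221; ends up matching.</p>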
<p>Another term, this one prefixed by the letter &#8216;I&#8217;, is then added to hold the unique article ID. The article ID is also assigned to the document as a value. This number relates a Xapian document to its original record in the SQL server.</p>
<p>The index() function simply selects the relevant articles and builds a Xapian document object for each of them. Instead of using add_document(), which can cause an article to be indexed multiple times in the database, we use replace_document(), which is given a unique term. If a document is already indexed by the given term, it will be replaced with the given document; otherwise, a new document will be added to the database.</p>
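<p>The replace-or-add behaviour is easy to picture with a dictionary keyed by the unique term. The sketch below is Python 3 and uses a made-up in-memory <code>ToyIndex</code> class; it only imitates the semantics described above and has nothing to do with how Xapian stores documents.</p>

```python
class ToyIndex:
    """In-memory stand-in that mimics replace-by-unique-term semantics."""

    def __init__(self):
        self._docs = {}

    def replace_document(self, id_term, doc):
        # Overwrite whatever is stored under id_term, or add a new entry.
        self._docs[id_term] = doc

    def doc_count(self):
        return len(self._docs)


index = ToyIndex()
index.replace_document("I15", "first draft")
index.replace_document("I15", "edited text")  # same article, indexed again
index.replace_document("I16", "another article")
print(index.doc_count())  # 2
```

<p>Indexing article 15 twice leaves a single entry, which is exactly why re-running the indexer over an overlapping time window is harmless.</p>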
<p>After the data is indexed, it is time to make it searchable.</p>
<h2>The Search Server: Seeing Results</h2>
<p>The role of the search server is to obtain queries from the application and to then return results. As we strive for a single-threaded implementation, the Twisted framework makes it extremely easy to write the code for this server. If you are not familiar with Twisted or XML RPC, don&#8217;t worry; just imagine we’re writing a controller with only one method exposed: xmlrpc_search().</p>
<div class="dean_ch" style="white-space: wrap;">
<p><span class="kw1">import</span> xapian</p>
<p><span class="kw1">from</span> twisted.<span class="me1">web</span> <span class="kw1">import</span> xmlrpc, server<br />
<span class="kw1">from</span> twisted.<span class="me1">internet</span> <span class="kw1">import</span> reactor, task<br />
<span class="kw1">from</span> <span class="kw3">time</span> <span class="kw1">import</span> <span class="kw3">time</span><br />
<span class="kw1">from</span> indexer <span class="kw1">import</span> ARTICLE_ID</p>
<p>DEFAULT_SEARCH_FLAGS = <span class="br0">&#40;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; xapian.<span class="me1">QueryParser</span>.<span class="me1">FLAG_BOOLEAN</span> |<br />
&nbsp; &nbsp; &nbsp; &nbsp; xapian.<span class="me1">QueryParser</span>.<span class="me1">FLAG_PHRASE</span> |<br />
&nbsp; &nbsp; &nbsp; &nbsp; xapian.<span class="me1">QueryParser</span>.<span class="me1">FLAG_LOVEHATE</span> |<br />
&nbsp; &nbsp; &nbsp; &nbsp; xapian.<span class="me1">QueryParser</span>.<span class="me1">FLAG_BOOLEAN_ANY_CASE</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#41;</span></p>
<p><span class="kw1">class</span> SearchServer<span class="br0">&#40;</span>xmlrpc.<span class="me1">XMLRPC</span><span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; <span class="kw1">def</span> <span class="kw4">__init__</span><span class="br0">&#40;</span><span class="kw2">self</span>, db<span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; xmlrpc.<span class="me1">XMLRPC</span>.<span class="kw4">__init__</span><span class="br0">&#40;</span><span class="kw2">self</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw2">self</span>.<span class="me1">db</span> = xapian.<span class="me1">Database</span><span class="br0">&#40;</span>db<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw2">self</span>.<span class="kw3">parser</span> = xapian.<span class="me1">QueryParser</span><span class="br0">&#40;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw2">self</span>.<span class="kw3">parser</span>.<span class="me1">add_prefix</span><span class="br0">&#40;</span><span class="st0">&quot;author&quot;</span>, <span class="st0">&quot;A&quot;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw2">self</span>.<span class="kw3">parser</span>.<span class="me1">add_prefix</span><span class="br0">&#40;</span><span class="st0">&quot;subject&quot;</span>, <span class="st0">&quot;S&quot;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw2">self</span>.<span class="kw3">parser</span>.<span class="me1">set_stemmer</span><span class="br0">&#40;</span>xapian.<span class="me1">Stem</span><span class="br0">&#40;</span><span class="st0">&quot;en&quot;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw2">self</span>.<span class="kw3">parser</span>.<span class="me1">set_stemming_strategy</span><span class="br0">&#40;</span>xapian.<span class="me1">QueryParser</span>.<span class="me1">STEM_SOME</span><span class="br0">&#41;</span></p>
<p>&nbsp; &nbsp; &nbsp; &nbsp; <span class="co1"># make sure database is reloaded every 10 minutes</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; lc = task.<span class="me1">LoopingCall</span><span class="br0">&#40;</span><span class="kw2">self</span>.<span class="me1">reopen_db</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; lc.<span class="me1">start</span><span class="br0">&#40;</span><span class="nu0">600</span><span class="br0">&#41;</span></p>
<p>&nbsp; &nbsp; <span class="kw1">def</span> reopen_db<span class="br0">&#40;</span><span class="kw2">self</span><span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw2">self</span>.<span class="me1">db</span>.<span class="me1">reopen</span><span class="br0">&#40;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">except</span> <span class="kw2">IOError</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">print</span> <span class="st0">&quot;Unable to open database&quot;</span></p>
<p>&nbsp; &nbsp; <span class="kw1">def</span> xmlrpc_search<span class="br0">&#40;</span><span class="kw2">self</span>, query, offset, count<span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="st0">&quot;&quot;</span><span class="st0">&quot;Search the database for &lt;query&gt;, return<br />
&nbsp; &nbsp; &nbsp; &nbsp; results[offset:offset+count], sorted by relevancy&quot;</span><span class="st0">&quot;&quot;</span></p>
<p>&nbsp; &nbsp; &nbsp; &nbsp; query = <span class="kw2">self</span>.<span class="kw3">parser</span>.<span class="me1">parse_query</span><span class="br0">&#40;</span>query.<span class="me1">encode</span><span class="br0">&#40;</span><span class="st0">&quot;utf8&quot;</span><span class="br0">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; DEFAULT_SEARCH_FLAGS<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; enquire = xapian.<span class="me1">Enquire</span><span class="br0">&#40;</span><span class="kw2">self</span>.<span class="me1">db</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; enquire.<span class="me1">set_query</span><span class="br0">&#40;</span>query<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; mset = enquire.<span class="me1">get_mset</span><span class="br0">&#40;</span>offset, count<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">except</span> <span class="kw2">IOError</span>, e:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">if</span> <span class="st0">&quot;DatabaseModifiedError&quot;</span> <span class="kw1">in</span> <span class="kw2">str</span><span class="br0">&#40;</span>e<span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw2">self</span>.<span class="me1">reopen_db</span><span class="br0">&#40;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">raise</span></p>
<p>&nbsp; &nbsp; &nbsp; &nbsp; results = <span class="br0">&#91;</span><span class="br0">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">for</span> m <span class="kw1">in</span> mset:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; results.<span class="me1">append</span><span class="br0">&#40;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#123;</span><span class="st0">&quot;percent&quot;</span>: m<span class="br0">&#91;</span>xapian.<span class="me1">MSET_PERCENT</span><span class="br0">&#93;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="st0">&quot;article_id&quot;</span>: m<span class="br0">&#91;</span>xapian.<span class="me1">MSET_DOCUMENT</span><span class="br0">&#93;</span>.<span class="me1">get_value</span><span class="br0">&#40;</span>ARTICLE_ID<span class="br0">&#41;</span><span class="br0">&#125;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">return</span> <span class="br0">&#123;</span><span class="st0">&quot;count&quot;</span>: mset.<span class="me1">get_matches_upper_bound</span><span class="br0">&#40;</span><span class="br0">&#41;</span>, <span class="st0">&quot;results&quot;</span>: results<span class="br0">&#125;</span><br />
<span class="kw1">import</span> <span class="kw3">sys</span></p>
<p><span class="kw1">if</span> <span class="kw2">len</span><span class="br0">&#40;</span><span class="kw3">sys</span>.<span class="me1">argv</span><span class="br0">&#41;</span>!=<span class="nu0">3</span>:<br />
&nbsp; &nbsp; <span class="kw1">print</span> <span class="st0">&quot;Usage: search.py &lt;port&gt; &lt;db&gt;&quot;</span><br />
&nbsp; &nbsp; <span class="kw3">sys</span>.<span class="me1">exit</span><span class="br0">&#40;</span><span class="nu0">1</span><span class="br0">&#41;</span></p>
<p>reactor.<span class="me1">listenTCP</span><span class="br0">&#40;</span><span class="kw2">int</span><span class="br0">&#40;</span><span class="kw3">sys</span>.<span class="me1">argv</span><span class="br0">&#91;</span><span class="nu0">1</span><span class="br0">&#93;</span><span class="br0">&#41;</span>, server.<span class="me1">Site</span><span class="br0">&#40;</span>SearchServer<span class="br0">&#40;</span><span class="kw3">sys</span>.<span class="me1">argv</span><span class="br0">&#91;</span><span class="nu0">2</span><span class="br0">&#93;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><span class="br0">&#41;</span><br />
reactor.<span class="me1">run</span><span class="br0">&#40;</span><span class="br0">&#41;</span></div>
<p>The search server constructor opens a database and initializes a query parser. We tell the query parser that the keyword &#8216;author&#8217; refers to terms that are prefixed with the letter &#8216;A&#8217; and that the keyword &#8216;subject&#8217; refers to terms that are prefixed with the letter &#8216;S.&#8217; This specification makes it possible to search for &#8220;author:john&#8221; or &#8220;subject:xapian.&#8221;</p>
<p>We then instruct Twisted to call reopen_db() every ten minutes. This reopening renders the latest changes in the database available to the search server. Each time Xapian&#8217;s library opens the database, it works against a fixed revision of it.</p>
<p>xmlrpc_search() is the only method that is exposed (since its name is prefixed by xmlrpc_). The offset (zero-based) and count arguments allow for an efficient way to split the search results into several pages. The call returns a dictionary with the total number of available results and a list with the selected subset of results. Each item in the list contains a unique article_id and a percent, which indicates each document’s relevancy score.</p>
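<p>On the application side, paging boils down to translating a page number into the offset and count that xmlrpc_search() expects. Here is a small Python 3 sketch; <code>page_to_window()</code> and <code>page_count()</code> are hypothetical helper names, and the default of ten results per page is just an assumption.</p>

```python
def page_to_window(page, per_page=10):
    """Translate a 1-based page number into an (offset, count) pair."""
    page = max(page, 1)  # treat anything below the first page as page 1
    return (page - 1) * per_page, per_page


def page_count(total, per_page=10):
    """Number of pages needed to show `total` results."""
    # Ceiling division without floats.
    return -(-total // per_page)


print(page_to_window(3))  # (20, 10)
print(page_count(23))     # 3
```

<p>Keep in mind that the count returned by the server comes from get_matches_upper_bound(), so the number of pages you compute from it should be treated as an estimate.</p>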
<p>The program receives the port number to listen on and the Xapian database path. Unless you&#8217;d like to expose your search functionality to the world, it is suggested that you block outside access to this port.</p>
<p>By now, you’re probably eager to try searching your own database. Here&#8217;s a quick way to do so. First, start the search server:</p>
<div class="dean_ch" style="white-space: wrap;">
$ python search.py 3000 ./my_database</div>
<p>From another terminal, start a Python shell:</p>
<div class="dean_ch" style="white-space: wrap;">
&gt;&gt;&gt; <span class="kw1">import</span> <span class="kw3">xmlrpclib</span><br />
&gt;&gt;&gt; s = <span class="kw3">xmlrpclib</span>.<span class="me1">Server</span><span class="br0">&#40;</span><span class="st0">&#39;http://localhost:3000&#39;</span><span class="br0">&#41;</span><br />
&gt;&gt;&gt; s.<span class="me1">search</span><span class="br0">&#40;</span><span class="st0">&#39;python -snake&#39;</span>, <span class="nu0">0</span>, <span class="nu0">10</span><span class="br0">&#41;</span><br />
<span class="br0">&#123;</span><span class="st0">&#39;count&#39;</span>: <span class="nu0">2</span>, <span class="st0">&#39;results&#39;</span>: <span class="br0">&#91;</span><span class="br0">&#123;</span><span class="st0">&#39;percent&#39;</span>: <span class="nu0">94</span>, <span class="st0">&#39;article_id&#39;</span>: <span class="nu0">15</span><span class="br0">&#125;</span>, <span class="br0">&#123;</span><span class="st0">&#39;percent&#39;</span>: <span class="nu0">79</span>, <span class="st0">&#39;article_id&#39;</span>: <span class="nu0">6</span><span class="br0">&#125;</span><span class="br0">&#93;</span><span class="br0">&#125;</span></div>
<p>In the same manner, you can use the search server from your application.</p>
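<p>Inside the application, the remaining work is to turn the returned article IDs back into real articles while keeping the relevancy order. The Python 3 sketch below assumes you have already loaded the matching rows into a dictionary keyed by article ID; <code>attach_articles()</code> and the sample data are made up for illustration.</p>

```python
def attach_articles(response, articles_by_id):
    """Pair each search hit with its article, keeping relevancy order."""
    hits = []
    for result in response["results"]:
        # int() copes with IDs that arrive as strings or as numbers.
        article = articles_by_id.get(int(result["article_id"]))
        if article is not None:  # skip hits deleted since the last index run
            hits.append((result["percent"], article))
    return hits


response = {"count": 2, "results": [
    {"percent": 94, "article_id": 15},
    {"percent": 79, "article_id": 6},
]}
articles_by_id = {6: "Intro to Twisted", 15: "Python tips"}
print(attach_articles(response, articles_by_id))
```

<p>Fetching all the IDs of a page with one database query and then reordering them this way avoids one SQL round trip per search result.</p>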
<h2>Working with Smaller Databases</h2>
<p>Search engines are optimized to return results that are sorted according to their relevancy. If you need your results sorted by another criterion, such as date or diggs, it might be useful to run the query over a smaller database. For example, you might try running it over a database that contains only articles from the previous month. This strategy can significantly increase your overall performance.</p>
<h2>Alternatives to Xapian</h2>
<p>While I haven&#8217;t tried working with search engine libraries other than Xapian, you can try the Java-based <a href="http://lucene.apache.org/">Lucene</a>, which can be accessed from Python using <a href="http://pylucene.osafoundation.org/">PyLucene</a>. The <a href="http://dev.krys.ca/turbolucene">TurboLucene</a> library makes it easier to use PyLucene from TurboGears.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thesamet.com/blog/2007/02/04/pumping-up-your-applications-with-xapian-full-text-search/feed/</wfw:commentRss>
		<slash:comments>23</slash:comments>
		</item>
		<item>
		<title>How to Stop Spam On phpBB Forums</title>
		<link>http://www.thesamet.com/blog/2006/12/21/fighting-spam-on-phpbb-forums/</link>
		<comments>http://www.thesamet.com/blog/2006/12/21/fighting-spam-on-phpbb-forums/#comments</comments>
		<pubDate>Thu, 21 Dec 2006 13:08:02 +0000</pubDate>
		<dc:creator>thesamet</dc:creator>
				<category><![CDATA[howto]]></category>

		<guid isPermaLink="false">http://www.thesamet.com/blog/2006/12/21/fighting-spam-on-phpbb-forums/</guid>
		<description><![CDATA[February 2009 Update: it seems that this page got very popular, and it&#8217;s not surprising: the method described here is simple and really works. Unfortunately due to the large number of help requests I am getting, I can&#8217;t provide you &#8230; <a href="http://www.thesamet.com/blog/2006/12/21/fighting-spam-on-phpbb-forums/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<div style="background: #eee; border: 1px solid black; padding: 0.5em;"><strong>February 2009 Update: </strong> it seems that this page got very popular, and it&#8217;s not surprising: the method described here is simple and really works. Unfortunately due to the large number of help requests I am getting, I can&#8217;t provide you personal assistance in implementing the solution described. If you would like one of my developers to implement this solution on your forum, feel free to contact us at <a href="http://www.stopforumspamnow.com">Stop Forum Spam Now</a> and we will be glad to help you.</div>
<p>I am running several phpBB-based forums, and they all started receiving serious amounts of spam recently. It seems that the spammers are now able to break the captcha in the registration and even pass the e-mail activation. I found a very simple solution for this. And from that moment on &#8211; the spam stopped.</p>
<p>The idea is to ask the spam bot a question that it does not expect but that human users can answer without any trouble. I&#8217;ve added the question &#8220;How much is 5+2 ?&#8221; to the registration form. Most of the new forum members were able to answer it on the first attempt. But spam bots had no clue.</p>
<p>So until someone bothers to write a spam bot specifically for my forums &#8211; I am okay. When it happens, I&#8217;ll just change the question. It can be many things: &#8220;What was the color of the white horse of Hammurabi?&#8221; or &#8220;How long did the six-day war last?&#8221; and so on. You get the point.</p>
<p>Here is how to do it.</p>
<p>In the template directory, edit profile_add_body.tpl, and add a new row to the form:</p>
<div class="dean_ch" style="white-space: wrap;"><span class="sc3"><span class="re1">&lt;tr<span class="re2">&gt;</span></span></span><br />
&nbsp; &nbsp; <span class="sc3"><span class="re1">&lt;td</span> <span class="re0">class</span>=<span class="st0">&quot;row1&quot;</span><span class="re2">&gt;</span></span><span class="sc3"><span class="re1">&lt;span</span> <span class="re0">class</span>=<span class="st0">&quot;gen&quot;</span><span class="re2">&gt;</span></span>How much is 5+2 *<span class="sc3"><span class="re1">&lt;/span<span class="re2">&gt;</span></span></span><span class="sc3"><span class="re1">&lt;/td<span class="re2">&gt;</span></span></span><br />
&nbsp; &nbsp; <span class="sc3"><span class="re1">&lt;td</span> <span class="re0">class</span>=<span class="st0">&quot;row2&quot;</span><span class="re2">&gt;</span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="sc3"><span class="re1">&lt;input</span> <span class="re0">type</span>=<span class="st0">&quot;text&quot;</span> <span class="re0">class</span>=<span class="st0">&quot;post&quot;</span> <span class="re0">style</span>=<span class="st0">&quot;width: 200px&quot;</span> <span class="re0">name</span>=<span class="st0">&quot;math_question&quot;</span> <span class="re0">size</span>=<span class="st0">&quot;6&quot;</span> <span class="re0">maxlength</span>=<span class="st0">&quot;6&quot;</span> <span class="re0">value</span>=<span class="st0">&quot;&quot;</span> <span class="re2">/&gt;</span></span><br />
&nbsp; &nbsp; <span class="sc3"><span class="re1">&lt;/td<span class="re2">&gt;</span></span></span><br />
<span class="sc3"><span class="re1">&lt;/tr<span class="re2">&gt;</span></span></span></div>
<p>Browse to the registration page on your forum to see that it looks right.</p>
<p>In includes/usercp_register.php, look around line 260, and add the condition that checks if the question was answered properly:</p>
<div class="dean_ch" style="white-space: wrap;"> &nbsp; &nbsp;<span class="kw1">else</span> <span class="kw1">if</span> <span class="br0">&#40;</span> <span class="re0">$mode</span> == <span class="st0">&#8216;register&#8217;</span> <span class="br0">&#41;</span><br />
&nbsp; &nbsp; <span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span> <a href="http://www.php.net/empty"><span class="kw3">empty</span></a><span class="br0">&#40;</span><span class="re0">$username</span><span class="br0">&#41;</span> || <a href="http://www.php.net/empty"><span class="kw3">empty</span></a><span class="br0">&#40;</span><span class="re0">$new_password</span><span class="br0">&#41;</span> || <a href="http://www.php.net/empty"><span class="kw3">empty</span></a><span class="br0">&#40;</span><span class="re0">$password_confirm</span><span class="br0">&#41;</span> || <a href="http://www.php.net/empty"><span class="kw3">empty</span></a><span class="br0">&#40;</span><span class="re0">$email</span><span class="br0">&#41;</span> <span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="re0">$error</span> = <span class="kw2">TRUE</span>;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="re0">$error_msg</span> .= <span class="br0">&#40;</span> <span class="br0">&#40;</span> <a href="http://www.php.net/isset"><span class="kw3">isset</span></a><span class="br0">&#40;</span><span class="re0">$error_msg</span><span class="br0">&#41;</span> <span class="br0">&#41;</span> ? <span class="st0">&#8216;&lt;br /&gt;&#8217;</span> : <span class="st0">&#8221;</span> <span class="br0">&#41;</span> . <span class="re0">$lang</span><span class="br0">&#91;</span><span class="st0">&#8216;Fields_empty&#8217;</span><span class="br0">&#93;</span>;<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span>;<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">if</span> <span class="br0">&#40;</span>!<a href="http://www.php.net/isset"><span class="kw3">isset</span></a><span class="br0">&#40;</span><span class="re0">$_POST</span><span class="br0">&#91;</span><span class="st0">'math_question'</span><span class="br0">&#93;</span><span class="br0">&#41;</span> || <span class="re0">$_POST</span><span class="br0">&#91;</span><span class="st0">'math_question'</span><span class="br0">&#93;</span> != <span class="st0">'7'</span><span class="br0">&#41;</span> <span class="br0">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="re0">$error</span> = <span class="kw2">TRUE</span>;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="re0">$error_msg</span> .= <span class="br0">&#40;</span><a href="http://www.php.net/isset"><span class="kw3">isset</span></a><span class="br0">&#40;</span><span class="re0">$error_msg</span><span class="br0">&#41;</span> ? <span class="st0">&#8216;&lt;br/&gt;&#8217;</span> : <span class="st0">&#8221;</span><span class="br0">&#41;</span> . <span class="st0">&quot;Incorrect answer to the mathematical question&#8230;&quot;</span>;<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#125;</span><br />
&nbsp; &nbsp; <span class="br0">&#125;</span></div>
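<p>Since the plan is to change the question whenever a bot catches up, it may help to keep the questions and their accepted answers in one place. Here is a minimal sketch of that idea in Python rather than PHP; the <code>QUESTIONS</code> table, its keys and the <code>is_correct_answer</code> helper are all made up for illustration and are not part of phpBB:</p>

```python
# Hypothetical question pool: id -> (question text, accepted answers).
# The entries are illustrative; swap them whenever a bot learns one.
QUESTIONS = {
    "five_plus_two": ("How much is 5+2?", {"7", "seven"}),
    "white_horse": ("What was the color of the white horse of Hammurabi?", {"white"}),
}


def is_correct_answer(question_id, submitted):
    """Return True if the submitted text answers the given question."""
    if submitted is None or question_id not in QUESTIONS:
        return False
    _, accepted = QUESTIONS[question_id]
    # Ignore surrounding whitespace and letter case in the user's input.
    return submitted.strip().lower() in accepted
```

<p>With a table like this, changing the question means editing one entry instead of touching both the template and the check.</p>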
]]></content:encoded>
			<wfw:commentRss>http://www.thesamet.com/blog/2006/12/21/fighting-spam-on-phpbb-forums/feed/</wfw:commentRss>
		<slash:comments>174</slash:comments>
		</item>
		<item>
		<title>Tutorial: How To Implement Tagging With TurboGears and SQLAlchemy</title>
		<link>http://www.thesamet.com/blog/2006/11/17/tutorial-how-to-implement-tagging-with-turbogears-and-sqlalchemy/</link>
		<comments>http://www.thesamet.com/blog/2006/11/17/tutorial-how-to-implement-tagging-with-turbogears-and-sqlalchemy/#comments</comments>
		<pubDate>Fri, 17 Nov 2006 18:43:20 +0000</pubDate>
		<dc:creator>thesamet</dc:creator>
				<category><![CDATA[howto]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[turbogears]]></category>

		<guid isPermaLink="false">http://www.thesamet.com/blog/2006/11/17/tutorial-how-to-implement-tagging-with-turbogears-and-sqlalchemy/</guid>
		<description><![CDATA[In this tutorial I&#8217;ll show you how to create a simple, yet powerful, tagging system using SQLAlchemy with TurboGears. As the concept of tags, and social tagging in particular, has become very popular, clients now demand &#8220;tagging-enabled&#8221; applications. So, here&#8217;s &#8230; <a href="http://www.thesamet.com/blog/2006/11/17/tutorial-how-to-implement-tagging-with-turbogears-and-sqlalchemy/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>In this tutorial I&#8217;ll show you how to create a simple, yet powerful, tagging system using SQLAlchemy with TurboGears. As the concept of tags, and social tagging in particular, has become very popular, clients now demand &#8220;tagging-enabled&#8221; applications. So, here&#8217;s a simple way to get you started.</p>
<p>Our application will associate sites with tags (a many-to-many relationship), like delicious does, but in a much simplified manner. For instance, delicious keeps track of which user gave which tag to which URL. We will only associate sites with tags, but it will be very easy to add this functionality later.</p>
<p>We&#8217;ll quickstart a new project (the -s argument tells tg-admin that the project will use SQLAlchemy and not SQLObject):</p>
<div class="dean_ch" style="white-space: wrap;">
$ tg-admin quickstart -s tags<br />
Enter package name [tags]:<br />
Do you need Identity (usernames/passwords) in this project? [no] yes</div>
<h3>Defining The Model</h3>
<p>We are going to have a table for the sites, a table for the tags and a table that associates sites with tags (many-to-many). Here&#8217;s the code which defines the tables (which goes in model.py):</p>
<div class="dean_ch" style="white-space: wrap;">
sites_table = Table<span class="br0">&#40;</span><span class="st0">&#8216;sites&#8217;</span>, metadata,<br />
&nbsp; &nbsp; &nbsp; &nbsp; Column<span class="br0">&#40;</span><span class="st0">&#8216;site_id&#8217;</span>, Integer, primary_key=<span class="kw2">True</span><span class="br0">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; Column<span class="br0">&#40;</span><span class="st0">&#8216;title&#8217;</span>, Unicode<span class="br0">&#40;</span><span class="nu0">256</span><span class="br0">&#41;</span><span class="br0">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; Column<span class="br0">&#40;</span><span class="st0">&#8216;url&#8217;</span>, Unicode<span class="br0">&#40;</span><span class="nu0">1024</span><span class="br0">&#41;</span><span class="br0">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="br0">&#41;</span></p>
<p>tags_table = Table<span class="br0">&#40;</span><span class="st0">&#8216;tags&#8217;</span>, metadata,<br />
&nbsp; &nbsp; &nbsp; &nbsp; Column<span class="br0">&#40;</span><span class="st0">&#8216;tag_id&#8217;</span>, Integer, primary_key=<span class="kw2">True</span><span class="br0">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; Column<span class="br0">&#40;</span><span class="st0">&#8216;name&#8217;</span>, Unicode<span class="br0">&#40;</span><span class="nu0">32</span><span class="br0">&#41;</span>, index=<span class="st0">&#8216;tag_idx&#8217;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></p>
<p>sites_tags_table = Table<span class="br0">&#40;</span><span class="st0">&#8216;sites_tags&#8217;</span>, metadata,<br />
&nbsp; &nbsp; &nbsp; &nbsp; Column<span class="br0">&#40;</span><span class="st0">&#8216;site_id&#8217;</span>, Integer,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ForeignKey<span class="br0">&#40;</span><span class="st0">&#8216;sites.site_id&#8217;</span><span class="br0">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; primary_key=<span class="kw2">True</span><span class="br0">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; Column<span class="br0">&#40;</span><span class="st0">&quot;tag_id&quot;</span>, Integer,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ForeignKey<span class="br0">&#40;</span><span class="st0">&#8216;tags.tag_id&#8217;</span><span class="br0">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; primary_key=<span class="kw2">True</span><span class="br0">&#41;</span><span class="br0">&#41;</span></div>
<p>We will now create the Python classes that correspond to these tables:</p>
<div class="dean_ch" style="white-space: wrap;">
<span class="kw1">class</span> Tag<span class="br0">&#40;</span><span class="kw2">object</span><span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; <span class="kw1">def</span> <span class="kw4">__init__</span><span class="br0">&#40;</span><span class="kw2">self</span>, name<span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw2">self</span>.<span class="me1">name</span> = name<br />
&nbsp; &nbsp; <span class="kw1">def</span> <span class="kw4">__repr__</span><span class="br0">&#40;</span><span class="kw2">self</span><span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">return</span> <span class="kw2">self</span>.<span class="me1">name</span><br />
&nbsp; &nbsp; <span class="kw1">def</span> link<span class="br0">&#40;</span><span class="kw2">self</span><span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">return</span> <span class="st0">&quot;/tags/&quot;</span>+<span class="kw2">self</span>.<span class="me1">name</span></p>
<p><span class="kw1">class</span> Site<span class="br0">&#40;</span><span class="kw2">object</span><span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; <span class="kw1">def</span> <span class="kw4">__init__</span><span class="br0">&#40;</span><span class="kw2">self</span>, url, title<span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw2">self</span>.<span class="me1">url</span>, <span class="kw2">self</span>.<span class="me1">title</span> = url, title</div>
<p>Note the <code>link()</code> method in the <code>Tag</code> class. You might wonder what it is doing there. It&#8217;s just a little habit that I wanted to share with you. I&#8217;ve found myself hard-coding URLs inside my templates many times. If you then want to make a tag linkable in many different places in your app, you have to hard-code the link every time. With a <code>link()</code> method, you can just pass the tag object to your template and do something like:</p>
<div class="dean_ch" style="white-space: wrap;"><span class="sc3"><span class="re1">&lt;a</span> <span class="re0">href</span>=<span class="st0">&quot;${tag.link()}&quot;</span><span class="re2">&gt;</span></span>${tag.name}<span class="sc3"><span class="re1">&lt;/a<span class="re2">&gt;</span></span></span></div>
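<p>One thing to watch with plain concatenation is that a tag name may contain characters that are not safe in a URL path. The following is a hedged variant of the <code>Tag</code> class, written for Python 3 and assuming you want such names percent-encoded; the use of <code>urllib.parse.quote</code> is my assumption, not something the original class does:</p>

```python
from urllib.parse import quote


class Tag(object):
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name

    def link(self):
        # safe="" also escapes "/", so a tag name cannot add path segments.
        return "/tags/" + quote(self.name, safe="")
```

<p>Plain names such as <code>photo</code> come out unchanged, so templates that call <code>tag.link()</code> keep working as before.</p>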
<p>Ok, now we continue with mapping the classes to the tables:</p>
<div class="dean_ch" style="white-space: wrap;">
mapper<span class="br0">&#40;</span>Tag, tags_table<span class="br0">&#41;</span><br />
mapper<span class="br0">&#40;</span>Site, sites_table, properties = <span class="br0">&#123;</span><br />
&nbsp; &nbsp; <span class="st0">&#8216;tags&#8217;</span>: relation<span class="br0">&#40;</span>Tag, secondary=sites_tags_table, lazy=<span class="kw2">False</span><span class="br0">&#41;</span>,<br />
&nbsp; &nbsp; <span class="br0">&#125;</span><span class="br0">&#41;</span></div>
<p>Great. Now we can construct the database and start populating it:</p>
<div class="dean_ch" style="white-space: wrap;">
$ tg-admin sql create<br />
$ tg-admin shell<br />
&#8230;<br />
&gt;&gt;&gt; g = Site<span class="br0">&#40;</span><span class="st0">&#8216;http://www.google.com&#8217;</span>, <span class="st0">&#8216;Search engine&#8217;</span><span class="br0">&#41;</span><br />
&gt;&gt;&gt; g.<span class="me1">tags</span><br />
<span class="br0">&#91;</span><span class="br0">&#93;</span><br />
&gt;&gt;&gt; g.<span class="me1">tags</span> = <span class="br0">&#91;</span>Tag<span class="br0">&#40;</span><span class="st0">&#8216;search&#8217;</span><span class="br0">&#41;</span>, Tag<span class="br0">&#40;</span><span class="st0">&#8216;google&#8217;</span><span class="br0">&#41;</span><span class="br0">&#93;</span><br />
&gt;&gt;&gt; session.<span class="me1">save</span><span class="br0">&#40;</span>g<span class="br0">&#41;</span><br />
&gt;&gt;&gt; session.<span class="me1">flush</span><span class="br0">&#40;</span><span class="br0">&#41;</span><br />
<span class="br0">&#40;</span>here SQLAlchemy echoes the SQL statements it executes<span class="br0">&#41;</span></div>
<h3>Handling tags</h3>
<p>So we got the model right. The next step is to allow the users to provide tags for the site. The easiest way (for you and your users) is to ask them to enter the tags in a space-separated list. Given such a space-separated string of tags from a user, you have to:</p>
<ul>
<li>convert all tags to lower case, in order to avoid case sensitivity issues</li>
<li>check if the string contains the same tag twice</li>
<li>find which tags are already in the database and which are new</li>
<li>recover from some nonsense that users might throw at you</li>
</ul>
<p>and then get a list of Tag objects that you can assign to a site. So here&#8217;s a function that does just that:</p>
<div class="dean_ch" style="white-space: wrap;">
<span class="kw1">def</span> get_tag_list<span class="br0">&#40;</span>tags<span class="br0">&#41;</span>:<br />
&nbsp; &nbsp; <span class="st0">&quot;&quot;</span><span class="st0">&quot;Take a string of space-separated tags<br />
&nbsp; &nbsp; and return a list of Tag objects.&quot;</span><span class="st0">&quot;&quot;</span><br />
&nbsp; &nbsp; result = <span class="br0">&#91;</span><span class="br0">&#93;</span><br />
&nbsp; &nbsp; tags = tags.<span class="me1">replace</span><span class="br0">&#40;</span><span class="st0">&#8216;;&#8217;</span>,<span class="st0">&#8216; &#8216;</span><span class="br0">&#41;</span>.<span class="me1">replace</span><span class="br0">&#40;</span><span class="st0">&#8216;,&#8217;</span>,<span class="st0">&#8216; &#8216;</span><span class="br0">&#41;</span></p>
<p>&nbsp; &nbsp; tags = <span class="br0">&#91;</span>tag.<span class="me1">lower</span><span class="br0">&#40;</span><span class="br0">&#41;</span> <span class="kw1">for</span> tag <span class="kw1">in</span> tags.<span class="me1">split</span><span class="br0">&#40;</span><span class="br0">&#41;</span><span class="br0">&#93;</span><br />
&nbsp; &nbsp; tags = <span class="kw2">set</span><span class="br0">&#40;</span>tags<span class="br0">&#41;</span> &nbsp; &nbsp; &nbsp; &nbsp;<span class="co1"># no duplicates!</span><br />
&nbsp; &nbsp; <span class="kw1">if</span> <span class="st0">&#8221;</span> <span class="kw1">in</span> tags:<br />
&nbsp; &nbsp; &nbsp; &nbsp; tags.<span class="me1">remove</span><span class="br0">&#40;</span><span class="st0">&#8221;</span><span class="br0">&#41;</span></p>
<p>&nbsp; &nbsp; <span class="kw1">for</span> tag <span class="kw1">in</span> tags:<br />
&nbsp; &nbsp; &nbsp; &nbsp; tag = tag.<span class="me1">lower</span><span class="br0">&#40;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; tagobj = session.<span class="me1">query</span><span class="br0">&#40;</span>Tag<span class="br0">&#41;</span>.<span class="me1">selectfirst_by</span><span class="br0">&#40;</span>name=tag<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">if</span> tagobj <span class="kw1">is</span> <span class="kw2">None</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; tagobj = Tag<span class="br0">&#40;</span>name=tag<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; result.<span class="me1">append</span><span class="br0">&#40;</span>tagobj<span class="br0">&#41;</span><br />
&nbsp; &nbsp; <span class="kw1">return</span> result</div>
<p>So you can now easily do something like:</p>
<div class="dean_ch" style="white-space: wrap;">
&gt;&gt;&gt; f = Site<span class="br0">&#40;</span><span class="st0">&#8216;http://www.flickr.com&#8217;</span>, <span class="st0">&#8216;Flickr!&#8217;</span><span class="br0">&#41;</span><br />
&gt;&gt;&gt; f.<span class="me1">tags</span> = get_tag_list<span class="br0">&#40;</span><span class="st0">'photo sharing photography'</span><span class="br0">&#41;</span><br />
&gt;&gt;&gt; f.<span class="me1">tags</span><br />
<span class="br0">&#91;</span>photo, sharing, photography<span class="br0">&#93;</span><br />
&gt;&gt;&gt; f.<span class="me1">tags</span><span class="br0">&#91;</span><span class="nu0">0</span><span class="br0">&#93;</span>.<span class="me1">link</span><span class="br0">&#40;</span><span class="br0">&#41;</span><br />
<span class="st0">&#8216;/tags/photo&#8217;</span></div>
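<p>The string clean-up part of <code>get_tag_list</code> can be tried on its own, without a database. Here is a minimal sketch for Python 3; <code>normalize_tags</code> is a hypothetical helper name, and unlike the <code>set</code> used in the function above it keeps the tags in the order the user typed them:</p>

```python
def normalize_tags(raw):
    """Split a user-supplied tag string into unique lower-case names."""
    # Treat ';' and ',' as separators too, as get_tag_list does.
    cleaned = raw.replace(";", " ").replace(",", " ")
    seen = set()
    names = []
    for name in cleaned.lower().split():
        if name not in seen:  # keep only the first occurrence
            seen.add(name)
            names.append(name)
    return names
```

<p>Each returned name can then be looked up in the tags table, or turned into a new <code>Tag</code> object, exactly as in the loop above.</p>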
<h3>Tag Search</h3>
<p>It is straightforward to just list a site together with its tags:</p>
<div class="dean_ch" style="white-space: wrap;">
<span class="sc3"><span class="re1">&lt;h3</span> <span class="re0">class</span>=<span class="st0">&quot;site-title&quot;</span><span class="re2">&gt;</span></span><span class="sc3"><span class="re1">&lt;a</span> <span class="re0">href</span>=<span class="st0">&quot;${site.url}&quot;</span> <span class="re0">target</span>=<span class="st0">&quot;_blank&quot;</span><span class="re2">&gt;</span></span>${site.title}<span class="sc3"><span class="re1">&lt;/a<span class="re2">&gt;</span></span></span><span class="sc3"><span class="re1">&lt;/h3<span class="re2">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="sc3"><span class="re1">&lt;p</span> <span class="re0">class</span>=<span class="st0">&quot;site-tags&quot;</span><span class="re2">&gt;</span></span>Tags:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="sc3"><span class="re1">&lt;a</span> <span class="re0">py:for</span>=<span class="st0">&quot;tag in site.tags[:5]&quot;</span> <span class="re0">href</span>=<span class="st0">&quot;${tag.link()}&quot;</span> <span class="re0">class</span>=<span class="st0">&quot;tag&quot;</span><span class="re2">&gt;</span></span>${tag.name}<span class="sc3"><span class="re1">&lt;/a<span class="re2">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="sc3"><span class="re1">&lt;/p<span class="re2">&gt;</span></span></span></div>
<p>Search is a bit more tricky. It took me a few attempts until I got the search queries right. Here&#8217;s how to fetch all sites that are tagged with &#8216;google&#8217;:</p>
<div class="dean_ch" style="white-space: wrap;">
&nbsp; &nbsp; q = session.<span class="me1">query</span><span class="br0">&#40;</span>Site<span class="br0">&#41;</span><br />
&nbsp; &nbsp; sites = q.<span class="kw3">select</span><span class="br0">&#40;</span><span class="br0">&#40;</span>Tag.<span class="me1">c</span>.<span class="me1">name</span>==<span class="st0">'google'</span><span class="br0">&#41;</span> &amp; q.<span class="me1">join_to</span><span class="br0">&#40;</span><span class="st0">'tags'</span><span class="br0">&#41;</span><span class="br0">&#41;</span></div>
<p>The magic is mostly inside the join_to method &#8211; it produces the SQL join condition that ties the Tag clause to the sites. Without it, the query runs over the entire Cartesian product of Sites x Tags.</p>
<p>You can make the query simpler (for MySQL, not for you) if you fetch the tag_id of &#8216;google&#8217; first. The query then uses only 2 of the 3 tables:</p>
<div class="dean_ch" style="white-space: wrap;">
&nbsp; &nbsp; tagobj = session.<span class="me1">query</span><span class="br0">&#40;</span>Tag<span class="br0">&#41;</span>.<span class="me1">get_by</span><span class="br0">&#40;</span>name=<span class="st0">&#8216;google&#8217;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; <span class="kw1">if</span> <span class="kw1">not</span> tagobj:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="kw1">raise</span> cherrypy.<span class="me1">InternalRedirect</span><span class="br0">&#40;</span><span class="st0">&#8216;/notfound&#8217;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; sites = session.<span class="me1">query</span><span class="br0">&#40;</span>Site<span class="br0">&#41;</span>.<span class="kw3">select</span><span class="br0">&#40;</span><span class="br0">&#40;</span>sites_tags_table.<span class="me1">c</span>.<span class="me1">tag_id</span> == tagobj.<span class="me1">tag_id</span><span class="br0">&#41;</span> &amp;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span class="br0">&#40;</span>sites_tags_table.<span class="me1">c</span>.<span class="me1">site_id</span> == Site.<span class="me1">c</span>.<span class="me1">site_id</span><span class="br0">&#41;</span><span class="br0">&#41;</span></div>
<p>To search for <code>google|photo</code>:</p>
<div class="dean_ch" style="white-space: wrap;">
q = session.<span class="me1">query</span><span class="br0">&#40;</span>Site<span class="br0">&#41;</span><br />
sites = q.<span class="kw3">select</span><span class="br0">&#40;</span><br />
&nbsp; &nbsp; Tag.<span class="me1">c</span>.<span class="me1">name</span>.<span class="me1">in_</span><span class="br0">&#40;</span><span class="st0">'google'</span>, <span class="st0">'photo'</span><span class="br0">&#41;</span> &amp;<br />
&nbsp; &nbsp; &nbsp; &nbsp; q.<span class="me1">join_to</span><span class="br0">&#40;</span><span class="st0">&#8216;tags&#8217;</span><span class="br0">&#41;</span><span class="br0">&#41;</span></div>
<p>To search for <code>sharing+photos</code>:</p>
<div class="dean_ch" style="white-space: wrap;">
q = session.<span class="me1">query</span><span class="br0">&#40;</span>Site<span class="br0">&#41;</span><br />
sites = q.<span class="kw3">select</span><span class="br0">&#40;</span><br />
&nbsp; &nbsp; Tag.<span class="me1">c</span>.<span class="me1">name</span>.<span class="me1">in_</span><span class="br0">&#40;</span><span class="st0">'sharing'</span>,<span class="st0">'photos'</span><span class="br0">&#41;</span> &amp;<br />
&nbsp; &nbsp; &nbsp; &nbsp; q.<span class="me1">join_to</span><span class="br0">&#40;</span><span class="st0">&#8216;tags&#8217;</span><span class="br0">&#41;</span>,<br />
&nbsp; &nbsp; group_by=<span class="br0">&#91;</span>Site.<span class="me1">c</span>.<span class="me1">site_id</span><span class="br0">&#93;</span>,<br />
&nbsp; &nbsp; having=<span class="br0">&#40;</span>func.<span class="me1">count</span><span class="br0">&#40;</span>Site.<span class="me1">c</span>.<span class="me1">site_id</span><span class="br0">&#41;</span>==<span class="nu0">2</span><span class="br0">&#41;</span><span class="br0">&#41;</span></div>
<p>The idea is that a site tagged with both &#8216;sharing&#8217; and &#8216;photos&#8217; appears twice in the joined rows, so grouping by site_id and keeping only the groups that contain two rows gives the desired result.</p>
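<p>To see the group-and-count trick in isolation, here is a small sketch that uses plain SQL through Python 3&#8217;s built-in <code>sqlite3</code> module instead of SQLAlchemy. The table layout follows the model above; the sample rows and the <code>sites_with_all_tags</code> helper are made up for illustration, and the helper assumes the tag names passed in are unique:</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sites (site_id INTEGER PRIMARY KEY, title TEXT, url TEXT);
    CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sites_tags (
        site_id INTEGER REFERENCES sites(site_id),
        tag_id INTEGER REFERENCES tags(tag_id),
        PRIMARY KEY (site_id, tag_id)
    );
""")
# Illustrative sample data: Flickr has both tags, the second site only one.
conn.executemany("INSERT INTO sites VALUES (?, ?, ?)", [
    (1, "Flickr!", "http://www.flickr.com"),
    (2, "Search engine", "http://www.google.com"),
])
conn.executemany("INSERT INTO tags VALUES (?, ?)",
                 [(1, "sharing"), (2, "photos"), (3, "search")])
conn.executemany("INSERT INTO sites_tags VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2), (2, 3)])


def sites_with_all_tags(conn, names):
    """Return titles of sites tagged with every name in `names`."""
    placeholders = ", ".join("?" for _ in names)
    sql = (
        "SELECT s.title FROM sites s "
        "JOIN sites_tags st ON st.site_id = s.site_id "
        "JOIN tags t ON t.tag_id = st.tag_id "
        "WHERE t.name IN (%s) "
        "GROUP BY s.site_id "
        "HAVING COUNT(*) = ? "  # one joined row per matching tag
        "ORDER BY s.site_id" % placeholders
    )
    params = list(names) + [len(names)]
    return [row[0] for row in conn.execute(sql, params)]
```

<p>The composite primary key on <code>sites_tags</code> is what makes the count reliable: a site cannot carry the same tag twice, so a count equal to the number of requested tags means every tag matched.</p>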
<p>There are many more things that can be done from this point, such as recording which user added each tag to each site, rendering a tag cloud, and so on. Feel free to leave comments!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thesamet.com/blog/2006/11/17/tutorial-how-to-implement-tagging-with-turbogears-and-sqlalchemy/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>How To Make IE Cache Less</title>
		<link>http://www.thesamet.com/blog/2006/07/14/making-ie-cache-less/</link>
		<comments>http://www.thesamet.com/blog/2006/07/14/making-ie-cache-less/#comments</comments>
		<pubDate>Fri, 14 Jul 2006 20:27:25 +0000</pubDate>
		<dc:creator>thesamet</dc:creator>
				<category><![CDATA[howto]]></category>
		<category><![CDATA[turbogears]]></category>

		<guid isPermaLink="false">http://www.thesamet.com/blog/2006/07/14/making-ie-cache-less/</guid>
		<description><![CDATA[Internet Explorer is known to cache the responses of GET calls. The problem occurs if your javascript functions request the same url over and over again. Internet Explorer will cache the response of the first call, and subsequent calls will &#8230; <a href="http://www.thesamet.com/blog/2006/07/14/making-ie-cache-less/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Internet Explorer is known to cache the responses of GET calls. The problem occurs if your javascript functions request the same url over and over again. Internet Explorer will cache the response of the first call, and subsequent calls will automatically return the same response, without actually contacting the server. There are two approaches to solve this problem.<br />
<span id="more-12"></span><br />
One approach is to add a random part to the URL (e.g. <code>/poll?random=f2dee87716f</code>), so that the browser treats it as a different URL every time. An alternative approach is to set the response headers correctly. The HTTP headers that need to be set are as follows:</p>
<pre><code>
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
</code></pre>
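<p>For completeness, the first approach (a random URL component) could look roughly like this when the URLs are generated on the server. This is a sketch for Python 3; <code>add_cache_buster</code> and its <code>param</code> argument are hypothetical names, and in the browser the same thing would be done in JavaScript:</p>

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, param="random"):
    """Return `url` with a fresh random query parameter appended."""
    parts = urlsplit(url)
    # Keep any query parameters the URL already has.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, uuid.uuid4().hex))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

<p>Every call produces a different URL, so the browser has no cached response to reuse.</p>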
<p>Here&#8217;s how I set these headers in TurboGears. To each method that I don&#8217;t want IE to cache, I add the strongly_expire decorator, like this:</p>
<pre><code>
    @expose()
    @strongly_expire
    def cant_cache_me(self, position):
        ...
</code></pre>
<p>Here is the code of the decorator, which is responsible for setting the headers properly:</p>
<pre><code>
def strongly_expire(func):
    """Decorator that sends headers that instruct browsers and proxies not to cache.
    """
    def newfunc(*args, **kwargs):
        cherrypy.response.headers['Expires'] = 'Sun, 19 Nov 1978 05:00:00 GMT'
        cherrypy.response.headers['Cache-Control'] = 'no-store, no-cache, must-revalidate, post-check=0, pre-check=0'
        cherrypy.response.headers['Pragma'] = 'no-cache'
        return func(*args, **kwargs)
    return newfunc
</code></pre>
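<p>If you want to try the pattern outside of TurboGears, here is a framework-independent sketch for Python 3. It differs from the decorator above in two ways that are my own assumptions: it takes the header mapping as an argument (a plain dict stands in for <code>cherrypy.response.headers</code>), and it uses <code>functools.wraps</code> so the wrapped function keeps its name and docstring:</p>

```python
import functools

NO_CACHE_HEADERS = {
    "Expires": "Sun, 19 Nov 1978 05:00:00 GMT",
    "Cache-Control": "no-store, no-cache, must-revalidate, post-check=0, pre-check=0",
    "Pragma": "no-cache",
}


def strongly_expire(headers):
    """Build a decorator that writes the no-cache headers into `headers`."""
    def decorator(func):
        @functools.wraps(func)  # preserve func.__name__ and docstring
        def wrapper(*args, **kwargs):
            headers.update(NO_CACHE_HEADERS)
            return func(*args, **kwargs)
        return wrapper
    return decorator


# A plain dict standing in for the framework's response headers.
response_headers = {}


@strongly_expire(response_headers)
def cant_cache_me(position):
    return "position %s" % position
```

<p>Calling <code>cant_cache_me</code> fills <code>response_headers</code> with the three headers before the function body runs.</p>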
]]></content:encoded>
			<wfw:commentRss>http://www.thesamet.com/blog/2006/07/14/making-ie-cache-less/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Four Tips on Identity Testing</title>
		<link>http://www.thesamet.com/blog/2006/06/02/four-tips-on-identity-testing/</link>
		<comments>http://www.thesamet.com/blog/2006/06/02/four-tips-on-identity-testing/#comments</comments>
		<pubDate>Fri, 02 Jun 2006 15:37:51 +0000</pubDate>
		<dc:creator>thesamet</dc:creator>
				<category><![CDATA[howto]]></category>
		<category><![CDATA[turbogears]]></category>

		<guid isPermaLink="false">http://thesamet.com/blog/2006/06/02/identity-testing-technicalities/</guid>
		<description><![CDATA[I remember that day about a month ago, when it occurred to me it is high time that I write some unit tests for my project. In the beginning, it felt like travelling where no man has gone before &#8211; &#8230; <a href="http://www.thesamet.com/blog/2006/06/02/four-tips-on-identity-testing/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I remember that day about a month ago, when it occurred to me it is high time that I write some unit tests for my project. In the beginning, it felt like travelling where no man has gone before &#8211; a lot of technical issues to settle. I introduced my BrowserSession class to aid stateful unit testing a few days ago. From the responses that I got, it seems that a lot of people are still fighting with the technicalities of setting up the testing environment. So I took the time to give some hints and tips on the matter. Attached to this post is a <a href="http://thesamet.com/blog/wp-content/uploads/2006/06/idtest.zip">project which demonstrates simple identity tests</a> (that work).<br />
<span id="more-10"></span><br />
<!--adsense--></p>
<p>Before you start, it will be helpful (but absolutely not required) if you also read about the <a href="http://thesamet.com/blog/2006/05/31/testing-multi-user-turbogears-applications/">BrowserSession object</a> that is used in this post.</p>
<h2>Tip 1: Set up the test configuration right.</h2>
<p>None of your configuration files (dev.cfg, prod.cfg or app.cfg) is loaded when you run your tests. Therefore, at the beginning of test_controllers.py, set your configuration options:</p>
<pre>
turbogears.config.update({
    'visit.on': True,
    'identity.on': True,
    'identity.failure_url': '/login',
    })
</pre>
<p>Another possibility is to create a configuration environment for testing, test.cfg and manually load it, using something like:</p>
<pre>
turbogears.update_config(configfile='project_dir/test.cfg',
                         modulename='package_name.config')
</pre>
<h2>Tip 2: Initialize your database right.</h2>
<p>testutil sets the default database to an in-memory SQLite database. This looks like a good choice to me (you want something that is fast and will not be affected by previous runs). If you want to change that, make sure you do it after you import testutil; I have never tried this myself.<br />
In any case, your tables may not exist, so a good time to create them is when the test module is imported. You can also use this opportunity to fill the tables with some data you want to use in your tests later:</p>
<pre>
def init_database():
    # hack follows: to create a TG_User, a request
    # environment must be around...
    testutil.create_request('/')
    if TG_User.selectBy(user_name='thesamet').count() == 0:
        TG_User(user_name='thesamet', password='password',
                display_name='Nadav', email_address='spam@me.not')

    # create the table if needed, then insert one row
    Table.createTable(ifNotExists=True)
    Table(table_number=3, table_location='far away')
</pre>
<p>Note the comment at the beginning of the function. Unless there is a cherrypy request around, you can&#8217;t create a TG_User record. This is due to some unfortunate coupling.<br />
<!--adsense--></p>
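<p>The pattern in this tip (create the tables if they are missing, then insert seed data only when it is not already there) does not depend on TurboGears. Here is a minimal sketch of the same idea using only the <code>sqlite3</code> module from the standard library. The <code>users</code> table and its columns are assumptions made for this example; they are not the real TG_User schema.</p>
<pre>
import sqlite3


def init_database(conn):
    """Create the schema if needed and seed it exactly once."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "user_name TEXT PRIMARY KEY, display_name TEXT)"
    )
    row = conn.execute(
        "SELECT COUNT(*) FROM users WHERE user_name = ?", ("thesamet",)
    ).fetchone()
    if row[0] == 0:
        # Insert the seed user only when it is not already present.
        conn.execute(
            "INSERT INTO users (user_name, display_name) VALUES (?, ?)",
            ("thesamet", "Nadav"),
        )
    conn.commit()


# An in-memory database starts empty on every run, like the testutil default.
conn = sqlite3.connect(":memory:")
init_database(conn)
init_database(conn)  # calling it again changes nothing
user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(user_count)
</pre>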
<h2>Tip 3: Identity likes your login buttons.</h2>
<p>When you emulate a user filling in the login form, don&#8217;t forget to also pass the login button value (which is &#8216;Login&#8217;). If you log in through the <code>/login</code> URL (as opposed to requesting the URL of some identity-protected method), do not forget to pass the forward_url argument. Example:</p>
<pre>
user = BrowserSession()
user.goto('/login?user_name=thesamet&#038;password=password' \
          '&#038;login=Login&#038;forward_url=/')
assert user.status != 403     # not forbidden
</pre>
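<p>Writing the query string by hand makes it easy to forget one of the fields. Here is a small sketch that builds the login URL from its parts with <code>urlencode</code> from the standard library. The helper name <code>login_url</code> is hypothetical, and the field names are simply the ones used in the example above.</p>
<pre>
from urllib.parse import urlencode


def login_url(user_name, password, forward_url="/", path="/login"):
    """Build a login URL that includes the button value and forward_url."""
    params = [
        ("user_name", user_name),
        ("password", password),
        ("login", "Login"),            # the value of the login button
        ("forward_url", forward_url),  # needed when posting to /login itself
    ]
    return path + "?" + urlencode(params)


print(login_url("thesamet", "password"))
</pre>
<p>The result can be passed straight to <code>user.goto()</code>. Note that <code>urlencode</code> percent-encodes the slash in forward_url, which the server decodes again when it parses the request.</p>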
<h2>Tip 4: Never forget to call stopTurboGears()</h2>
<p>You must call <code>turbogears.startup.stopTurboGears()</code> at the end of <strong>each</strong> test. I won&#8217;t pretend that I understand why it is needed. It is also done in <a href="http://trac.turbogears.org/turbogears/browser/trunk/turbogears/identity/tests/test_identity.py">TurboGears&#8217; own identity tests</a>. If you don&#8217;t do that you&#8217;ll get nice exceptions from the VisitManager thread. I like to factor out the stopTurboGears() call to the tearDown part of my test case.</p>
<p>The following is an example of a test case object that verifies that the protected resource <code>/secret</code> is well-protected:</p>
<pre>
class SecretPageTest(unittest.TestCase):
    def setUp(self):
        self.user = BrowserSession()

    def test_anonymous(self):
        self.user.goto('/')
        assert 'Login' in self.user.response
        self.user.goto('/secret')
        assert 'You must provide your credentials' \
                  in self.user.response

    def test_bad_credentials(self):
        self.user.goto('/secret?user_name=thesamet&#038;password=incorrect' \
                       '&#038;login=Login')
        assert 'The credentials you supplied were not correct' \
                           in self.user.response

    def test_successful_login(self):
        self.user.goto('/secret?user_name=thesamet&#038;password=password' \
                       '&#038;login=Login')
        assert 'This is a secret page' in self.user.response

    def test_logout(self):
        self.user.goto('/login?user_name=thesamet&#038;password=password' \
                       '&#038;login=Login&#038;forward_url=/')
        # not forbidden
        assert self.user.status != 403
        self.user.goto('/')
        # display name should appear and no suggestion to login
        assert 'Nadav' in self.user.response
        assert '&lt;A HREF="/login"&gt;Login&lt;/A&gt;' not in self.user.response
        self.user.goto('/logout')
        self.user.goto('/')
        # no display name should appear and a suggestion to login
        assert 'Nadav' not in self.user.response
        assert '&lt;A HREF="/login"&gt;Login&lt;/A&gt;' in self.user.response

    def tearDown(self):
        turbogears.startup.stopTurboGears()
</pre>
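<p>Putting the cleanup call in tearDown works because unittest runs tearDown after every test method, whether the test passed or failed. The following sketch demonstrates this without TurboGears: <code>stop_server</code> is a hypothetical stand-in for <code>turbogears.startup.stopTurboGears()</code> that only records that it was called.</p>
<pre>
import io
import unittest

stop_calls = []


def stop_server():
    # Hypothetical stand-in for turbogears.startup.stopTurboGears().
    stop_calls.append('stopped')


class ServerTestCase(unittest.TestCase):
    """Base class that factors the cleanup call into tearDown."""

    def tearDown(self):
        stop_server()


class DemoTest(ServerTestCase):
    def test_passes(self):
        self.assertEqual(1 + 1, 2)

    def test_fails(self):
        self.fail('deliberate failure')


def run_demo():
    """Run DemoTest quietly and return the unittest result object."""
    suite = unittest.TestLoader().loadTestsFromTestCase(DemoTest)
    runner = unittest.TextTestRunner(stream=io.StringIO())
    return runner.run(suite)


result = run_demo()
print(result.testsRun, len(result.failures), len(stop_calls))
</pre>
<p>One of the two tests fails on purpose, yet the cleanup is recorded once for each of them.</p>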
<p>You can download the <a href="http://thesamet.com/blog/wp-content/uploads/2006/06/idtest.zip">full source code of a minimal identity testable project</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thesamet.com/blog/2006/06/02/four-tips-on-identity-testing/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>How to Deploy TurboGears Applications on BlueHost</title>
		<link>http://www.thesamet.com/blog/2006/04/15/how-to-deploy-turbogears-applications-on-bluehost/</link>
		<comments>http://www.thesamet.com/blog/2006/04/15/how-to-deploy-turbogears-applications-on-bluehost/#comments</comments>
		<pubDate>Sat, 15 Apr 2006 06:58:54 +0000</pubDate>
		<dc:creator>thesamet</dc:creator>
				<category><![CDATA[howto]]></category>
		<category><![CDATA[turbogears]]></category>

		<guid isPermaLink="false">http://thesamet.com/blog/2006/04/15/how-to-deploy-turbogears-applications-on-bluehost/</guid>
		<description><![CDATA[I have been a customer of BlueHost (a very friendly web hosting provider) since I created the Python Challenge, about a year ago. A short time later, I added to my account the domain thesamet.com which I use as my &#8230; <a href="http://www.thesamet.com/blog/2006/04/15/how-to-deploy-turbogears-applications-on-bluehost/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<div style="float: right; width: 150px"><a href="http://www.bluehost.com/track/thesamet/code1"><img alt="Sign up to BlueHost" src="http://img.bluehost.com/180x150/1.gif" /></a></div>
<p>I have been a customer of <a title="BlueHost" href="http://www.bluehost.com/track/thesamet/blog">BlueHost</a> (a very friendly web hosting provider) since I created the <a title="Python Challenge" href="http://www.pythonchallenge.com">Python Challenge</a>, about a year ago. A short time later, I added to my account the domain <a title="thesamet.com" href="http://www.thesamet.com">thesamet.com</a> which I use as my personal homepage and blog. Recently, I started working with TurboGears and wondered whether BlueHost could host my application. Fortunately, it was quite easy to do so.</p>
<p><b>UPDATE: BlueHost does not officially support TurboGears. Furthermore, FastCGI is much slower and more difficult to work with than mod_rewrite. Therefore, I prefer to spend a few extra bucks a month and get a hosting plan at <a href="http://www.webfaction.com?affiliate=thesamet">WebFaction</a>, where deployment of a TurboGears application is a matter of point and click.</b></p>
<p><span id="more-5"></span></p>
<p>For the benefit of new users, I&#8217;ve written a small shell script that does most of the boring installation chores. By following the instructions in this post, you&#8217;ll have your TurboGears applications installed behind Apache using FastCGI. I personally tested these instructions on BlueHost servers, but it is very likely that they will work for other hosting providers. Please let me know if it worked for you. Most of this work is based on the <a title="TurboGearsOnDreamHost" href="http://trac.turbogears.org/turbogears/wiki/TurboGearsOnDreamHost">TurboGearsOnDreamHost</a> page in the TurboGears Trac.</p>
<h4>Installing TurboGears in your account</h4>
<p>Connect to your account via ssh. If you are a Windows user, you can use <a href="http://www.chiark.greenend.org.uk/~sgtatham/putty/">PuTTY</a> for that. Once you are logged into your account, type:</p>
<pre>wget "http://www.thesamet.com/tg_scripts/tg_inst.sh"
sh ./tg_inst.sh</pre>
<p>The script&#8217;s execution takes a while. It downloads and compiles Python in your home directory and then installs the latest TurboGears preview, docutils and MySQL-python. If the installation succeeds, you can skip the following section.</p>
<h4>Installation problems?</h4>
<p>Failures of <code>tg_inst.sh</code> can have many causes. Because the script downloads files from remote servers, networking issues are a likely culprit, and simply running it again may help. You can also try to delete the directory named <code>build</code> by typing</p>
<pre>rm -rf ~/build</pre>
<p>before you retry. <code>tg_inst.sh</code> logs all output to <code>~/build/stdout</code> and <code>~/build/stderr</code>. These files will contain more information about any failures.</p>
<h4>Installing your TurboGears application</h4>
<p>In this section we will configure your application to work behind Apache/FastCGI. The code examples below assume that your application name is wiki20 and you want it to be accessible through http://mydomain.com/wiki/.<br />
Upload your application to any directory which is <strong>not</strong> accessible from the web, for instance to <code>/home/username/wiki20</code>. The top-level directory contains the file prod.cfg. Start editing it by typing:</p>
<pre>nano ~/wiki20/prod.cfg</pre>
<p>You have to set the line that starts with <code>server.webpath</code> to point to the url of your application relative to the site root. In our example, we will have:</p>
<pre>server.webpath="/wiki/"</pre>
<p>We also need to create a MySQL database for your application. Log in to the cPanel and choose &#8220;MySQL Databases&#8221;. Create a new database and a user, and add the user to the database. Go back to prod.cfg, and uncomment the MySQL dburi string. It should look like:</p>
<pre>sqlobject.dburi="mysql://username:password@localhost/dbname"</pre>
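<p>A typo in the dburi string is an easy mistake to make. As a rough sanity check, the string can be split into its parts with <code>urlsplit</code> from the Python standard library. This sketch only inspects the text; it does not connect to the database, and the way SQLObject itself parses the string may differ. The helper name <code>check_dburi</code> is made up for this example.</p>
<pre>
from urllib.parse import urlsplit


def check_dburi(dburi):
    """Split a dburi string into its parts for a quick visual check."""
    parts = urlsplit(dburi)
    return {
        'scheme': parts.scheme,
        'user': parts.username,
        'password': parts.password,
        'host': parts.hostname,
        'database': parts.path.lstrip('/'),
    }


info = check_dburi('mysql://username:password@localhost/dbname')
# Any part that came out empty points at a malformed string.
missing = [key for key, value in info.items() if not value]
print(info)
print(missing)
</pre>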
<h4>Configuring Apache</h4>
<p>Now we create the root directory for your application:</p>
<pre>mkdir ~/www/wiki
cd ~/www/wiki
wget http://www.thesamet.com/tg_scripts/tg_fcgi.tar.gz
tar xzvf tg_fcgi.tar.gz</pre>
<p>If you want the application to be on the root of the site, just omit the <code>wiki</code> part. The last command creates two files in this directory. You have to edit the file <code>tg_fastcgi.fcgi</code> (you can use nano as before) to set a few pieces of information. On the first line, change the username to your real user name. Then look inside for &#8220;START USER EDIT SECTION&#8221;; you will probably only have to change wiki to your real project name. To find out your <code>code_dir</code>, <code>cd</code> to the folder where you uploaded your application and type <code>pwd</code>.</p>
<p>You can now try to start your application from the web. Just navigate to http://mydomain.com/wiki/tg_fastcgi.fcgi. It may take up to 30 seconds to load. Even if it gives you an error message, try reloading the page; it may work the second time. See also the &#8220;If anything goes wrong&#8221; section below.</p>
<p>The next thing we are going to do is set up rewrite rules, so your visitors will see only clean URLs. Create (or edit) the file <code>.htaccess</code> in the application site root directory by typing</p>
<pre>nano .htaccess</pre>
<p>Make sure it contains the following:</p>
<pre>RewriteEngine On
RewriteRule ^(tg_fastcgi.fcgi/.*)$ - [L]
RewriteRule ^(.*)$ /wiki/tg_fastcgi.fcgi/wiki/$1 [L]</pre>
<p>Note that the application webpath appears twice in the last line. Then change the permission of this file:</p>
<pre>chmod 755 .htaccess</pre>
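<p>To see why the two rewrite rules are needed, here is a rough Python model of how they treat a request path. It assumes the usual behaviour of rules in a <code>.htaccess</code> file: the pattern is matched against the path relative to the directory (so without the leading <code>/wiki/</code>), and a substitution of <code>-</code> leaves the path unchanged. The function name <code>rewrite</code> is made up for this example, and the model ignores everything else Apache does.</p>
<pre>
import re

WEBPATH = "/wiki/"


def rewrite(relative_path):
    """Return the path that is served for a path relative to WEBPATH."""
    # Rule 1: requests already addressed to the FastCGI script pass through
    # unchanged. Without this rule, rule 2 would rewrite its own output.
    if re.match(r"^(tg_fastcgi.fcgi/.*)$", relative_path):
        return WEBPATH + relative_path
    # Rule 2: everything else is handed to the FastCGI script. The webpath
    # appears twice: once to locate the script, once as the path it receives.
    match = re.match(r"^(.*)$", relative_path)
    return "/wiki/tg_fastcgi.fcgi/wiki/" + match.group(1)


first = rewrite("page/FrontPage")
print(first)
# Feeding the rewritten path back in leaves it unchanged.
second = rewrite(first[len(WEBPATH):])
print(second)
</pre>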
<p>Now try opening http://mydomain.com/wiki/ in a web browser.</p>
<p>Congratulations! Your web application is now deployed.</p>
<h4>If anything goes wrong</h4>
<p>If you can&#8217;t access your application, check the log files created in its code directory. They may contain some hints.</p>
<p>If they are not even created, try to start tg_fastcgi.fcgi manually and check the logs. After you change any configuration file, it is advisable to kill all fastcgi processes by typing <code>pkill fastcgi</code>.<br />
To check whether fastcgi is running, list the fastcgi processes by typing</p>
<pre>ps uax | grep fastcgi</pre>
<p><a href="http://www.bluehost.com/track/thesamet/code17"><img alt="Sign up to BlueHost" src="http://img.bluehost.com/468x60/10.gif" /></a></p>
<h4>Conclusion</h4>
<p>You can clean up and save some disk space by deleting the build directory and the installation script: type <code>rm -rf ~/build ~/tg_inst.sh</code>. I hope this information was helpful to you and that you&#8217;ll have joy and success with your TurboGears application.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thesamet.com/blog/2006/04/15/how-to-deploy-turbogears-applications-on-bluehost/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

