Haystack with Xapian or Solr or PyLucene : Forums : PythonAnywhere

Haystack with Xapian or Solr or PyLucene

Hi,

Are these supported/available on PA? I want to use them, and wanted to check before I start development.

deleted-user-61217 | 9 posts | Aug. 20, 2013, 5:29 p.m. | permalink

Not right now, but it's on the list. I've added an upvote.

giles | 12074 posts | PythonAnywhere staff | Aug. 21, 2013, 10:43 a.m. | permalink

Hmm... I checked today by importing the xapian package and running a simple program, and it ran fine. Looks like I will just drop Haystack and live with Xapian alone.
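A quick way to repeat this kind of check is sketched below. It is not the program referred to above, just a minimal illustration that looks for importable modules without importing them. The module names other than `xapian` are assumptions (the usual import names for Whoosh, pysolr, and PyLucene), and `is_importable` and `report` are hypothetical helper names.

```python
import importlib.util

# Assumed import names: "xapian" is the package named in this thread;
# "whoosh", "pysolr" and "lucene" are the usual names for the other backends.
CANDIDATES = ("xapian", "whoosh", "pysolr", "lucene")


def is_importable(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def report(candidates=CANDIDATES):
    """Map each candidate module name to whether it is importable here."""
    return {name: is_importable(name) for name in candidates}


if __name__ == "__main__":
    for name, found in report().items():
        print(f"{name}: {'available' if found else 'missing'}")
```

Being importable only shows that the Python bindings are installed. It says nothing about whether a search server such as Solr is running.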

deleted-user-61217 | 9 posts | Aug. 21, 2013, 12:28 p.m. | permalink

That's odd! I don't think we installed it deliberately -- perhaps something else installed it as a dependency. I'll make a note to add a test so that we make sure it's there for future versions.

giles | 12074 posts | PythonAnywhere staff | Aug. 21, 2013, 2:12 p.m. | permalink

Hey giles,

One more question.

My current search implementation uses DB queries backed by a local in-memory cache. I want to replace this search mechanism with Xapian, because Xapian prebuilds its index (B+ tree) and is disk-based (it maintains its index in files).

What I want to know is: is there a minimum guarantee of disk access speed on the PA platform? Or do you see any flaw in my plan?

deleted-user-61217 | 9 posts | Aug. 22, 2013, 6 p.m. | permalink

Hmm, that sounds like it would work. But the disk access speed question is a good one. We can't guarantee disk access speeds right now, unfortunately. The thing is, your disk needs to be accessible from a number of different machines -- the ones where your consoles run, the ones where your web apps run, and the ones where your scheduled tasks run. So it's networked storage, which means that access can be slow and speed can vary.

On the other hand, we're putting a lot of work into making it as fast as possible; yesterday we released an upgrade with a significant improvement (which we got by moving Dropbox syncing to a different server). And we'll be working on it more in the future.

I guess the best thing to do to get a feel as to whether it's likely to be acceptable would be to see if people advise against using networked storage for xapian indexes. We're using NFS as the transport, in case that matters.
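Another rough way to get a feel for it is to time small random reads on the storage in question. The sketch below is only an illustration under stated assumptions: `make_test_file` and `time_random_reads` are hypothetical helpers, the 4 KB block size and read count are arbitrary, and random block reads are only a loose stand-in for how a search index is actually accessed.

```python
import os
import random
import tempfile
import time


def make_test_file(directory, size_bytes=1024 * 1024):
    """Create a file of random bytes in `directory` and return its path."""
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


def time_random_reads(path, reads=200, block_size=4096):
    """Return the mean number of seconds per random block read from `path`."""
    size = os.path.getsize(path)
    upper = max(1, size - block_size)  # keep each read inside the file
    with open(path, "rb", buffering=0) as f:
        start = time.perf_counter()
        for _ in range(reads):
            f.seek(random.randrange(upper))
            f.read(block_size)
        elapsed = time.perf_counter() - start
    return elapsed / reads


if __name__ == "__main__":
    # Point this at a directory on the storage you want to measure.
    target = make_test_file(os.path.expanduser("~"))
    try:
        print(f"mean read time: {time_random_reads(target) * 1e6:.1f} microseconds")
    finally:
        os.remove(target)
```

A freshly written file is likely to be served from the operating system's cache, so numbers from this sketch are probably optimistic. Running it at different times of day would give a better idea of how much the speed varies.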

giles | 12074 posts | PythonAnywhere staff | Aug. 22, 2013, 6:29 p.m. | permalink

Isn't the same argument (networked storage) applicable to the DB as well? I assume the DB is accessed over the network too, so the speed of accessing the disk should be the same as the speed of accessing the DB, unless the PA team did something special to optimize the access?

deleted-user-61217 | 9 posts | Aug. 30, 2013, 8:07 p.m. | permalink

I would say the main difference is the amount of data that needs to be shuffled back and forth. A database access is usually a very small query that tells the database to send a small subset of the database back to the client. So you can get useful access to a multi-gigabyte database where you're only sending a few K each way. For filesystem-type access, you're assuming that the file is available in its entirety, so you could end up shipping gigabytes across the network to filter out a few pieces.
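The difference can be sketched with a toy example. This is only an illustration, assuming an in-memory SQLite database as a stand-in for a networked database server (no network is involved here), and counting rows handed to the client as a rough proxy for data transferred. The table and function names are hypothetical.

```python
import sqlite3


def build_db(rows=10000):
    """Create an in-memory table of `rows` documents titled 'title 0', 'title 1', ..."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO docs (title) VALUES (?)",
        ((f"title {i}",) for i in range(rows)),
    )
    conn.commit()
    return conn


def query_style(conn, needle):
    """Database-style access: the filter runs inside the database,
    so only matching rows are handed back to the client."""
    rows = conn.execute(
        "SELECT id, title FROM docs WHERE title = ?", (needle,)
    ).fetchall()
    return rows, len(rows)


def file_style(conn, needle):
    """File-style access: the client pulls everything, then filters locally."""
    everything = conn.execute("SELECT id, title FROM docs").fetchall()
    matches = [row for row in everything if row[1] == needle]
    return matches, len(everything)


if __name__ == "__main__":
    conn = build_db()
    for fn in (query_style, file_style):
        matches, transferred = fn(conn, "title 42")
        print(f"{fn.__name__}: {len(matches)} match(es), {transferred} row(s) transferred")
```

Both functions find the same match, but the second hands every row to the client first. How close real index access over networked storage comes to that worst case depends on how the index library reads its files.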

glenn | 9718 posts | PythonAnywhere staff | Aug. 31, 2013, 12:58 p.m. | permalink

Is Haystack still not supported??? Going to have to cancel if so.

deleted-user-798050 | 1 post | May 22, 2015, 2:36 a.m. | permalink

You can use haystack with the Whoosh backend (since it's just Python and doesn't need a server), just not any of the others. We could probably also support Xapian soon. Solr and ElasticSearch require quite a bit more work on our infrastructure.
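For reference, a minimal sketch of what a Whoosh-backed Haystack configuration might look like in a Django settings file. The index directory name `whoosh_index` and the `BASE_DIR` stand-in are assumptions for illustration, so adjust them to your own project layout.

```python
import os

# Stand-in for illustration: real Django settings usually derive BASE_DIR
# from the location of settings.py rather than the working directory.
BASE_DIR = os.getcwd()

# Haystack connection using the pure-Python Whoosh backend.
# "whoosh_index" is a hypothetical directory name for the on-disk index.
HAYSTACK_CONNECTIONS = {
    "default": {
        "ENGINE": "haystack.backends.whoosh_backend.WhooshEngine",
        "PATH": os.path.join(BASE_DIR, "whoosh_index"),
    },
}
```

This assumes `haystack` is also listed in `INSTALLED_APPS` and that the index is built with `python manage.py rebuild_index`. The index lives in files under `PATH`, so the earlier points in this thread about networked storage apply to it as well.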

glenn | 9718 posts | PythonAnywhere staff | May 22, 2015, 3:07 p.m. | permalink

Any updates on the support for Haystack?

janis | 35 posts | March 17, 2016, 5 a.m. | permalink

Still only the Whoosh backend, I'm afraid :(

conrad | 4232 posts | PythonAnywhere staff | March 17, 2016, 1:01 p.m. | permalink

Have there been any updates on the support for Haystack, specifically Solr?

deleted-user-1495514 | 4 posts | Oct. 1, 2016, 9:21 p.m. | permalink

No, there haven't.

glenn | 9718 posts | PythonAnywhere staff | Oct. 2, 2016, 11:47 a.m. | permalink

How about now? :)

deleted-user-2415668 | 1 post | Aug. 12, 2017, 10 p.m. | permalink

We will post an update when there is an update.

glenn | 9718 posts | PythonAnywhere staff | Aug. 13, 2017, 9:49 a.m. | permalink