"I think I know, I don't think I know, I don't think I think I know, I don't think I think." - Ed (Cowboy Bebop)
March 2010
openSUSE Search
During the last two weeks I was, among other things, investigating how to implement search across all openSUSE web pages. As part of our Umbrella project, we want to make all our websites look unified, and searching across all of them is part of this goal. So what did I try and what are my conclusions? Let's see...
Customized Google Search
The first idea would be to use Google search. They offer customized search for anyone's website. They are good at searching and we wouldn't need to implement anything ourselves. Another upside of this approach is that we wouldn't need any infrastructure for it. They would let us use their machines.
But it has some downsides as well. One minor thing is that it will index all our sites regardless of their content. So wiki pages may come up as less relevant than a comment on someone's blog. I don't think we really want this.
The major downside, as I see it, is the agreement. I'm not a lawyer, but I didn't like it. Let's say that the requirement to display some advertisements and Google preferred links is OK. But according to the agreement we can't customize the results as much as we want; we can only provide a theme and they may use it somehow (there are no details in the agreement). Another thing I found quite disturbing is that they can use our logo and trademarks forever to promote their products. Well, I don't think we mind right now, but forever is a long time. Maybe I just didn't understand the agreement well, but I'm quite sure that we don't want to accept it without discussing it with some skilled lawyers.
Last but not least, as an Open Source community we should try to go for an open solution. So I decided to check some of the Open Source engines available on the internet...
Open Source Solutions
I took a brief look at Xyzse and Swish++. The disadvantage I found was that their latest versions seemed to have been released sometime in 2008. This doesn't have to be bad, but I think that something more alive may be better. The other thing I didn't like was that it seemed I would need to hardcode some search limits when compiling the packages (at least it looked that way from the installation instructions, which required editing some headers manually).
YaCy
YaCy is a really interesting search engine. Its main innovative idea is that it is decentralized. You just run one peer, connect it to the network and then search across all peers in that network. You don't need any big server; you can make it work with everybody just indexing their own website. Really interesting idea. One small thing I personally didn't like was that it is written in Java (I don't really speak Java). It was quite easy to try, as it starts its own web server, but it looked like it wouldn't be easy to customize. It would be great to use for my own web pages, but I think we want something else for openSUSE.
Datapark Search Engine
The last search engine I want to talk about is Datapark Search Engine. It is an Open Source engine written in C. For data storage it can use MySQL, PostgreSQL or SQLite. It can be used as a CGI on the web, as an Apache module or through its PHP bindings. The results page is highly customizable. It's just an HTML template that gets read and filled with results. So it wouldn't be any problem to create a Bento theme for it and integrate it with the rest of our websites.
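To illustrate the "template that gets read and filled with results" idea, here is a minimal sketch in Python. It is not Datapark's template syntax; the placeholder names (`$query`, `$items`, `$url`, `$title`) and the `render` function are made up for this example, and it only uses the standard library.

```python
from html import escape
from string import Template

# Hypothetical templates; a real theme would carry the Bento markup instead.
ITEM = Template('<li><a href="$url">$title</a></li>')
PAGE = Template("<h1>Results for $query</h1>\n<ul>\n$items\n</ul>")


def render(query, results):
    """Fill the page template with (url, title) pairs, escaping user-visible text."""
    items = "\n".join(
        ITEM.substitute(url=escape(url, quote=True), title=escape(title))
        for url, title in results
    )
    return PAGE.substitute(query=escape(query), items=items)
```

Calling `render("MySQL", [("http://en.opensuse.org/MySQL", "MySQL")])` returns the filled page as a string. The point is only that the look of the results lives in the template, so changing the theme does not require touching the search code.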
Another interesting feature is that it allows us to tag all servers and create a hierarchical category list to make searching in some part of our infrastructure easier. I haven't tried this feature yet, but I think we can use it. We can also give some extra points to the most relevant sites (I think the wiki deserves this).
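The "extra points" idea can be sketched as a per-site multiplier applied to whatever relevance score the engine computes. This is only an illustration of the concept, not how Datapark implements it; the hostnames, weights and function names below are assumptions.

```python
from urllib.parse import urlparse

# Hypothetical boosts: hostnames and weights are illustrative only.
SITE_BOOST = {
    "en.opensuse.org": 2.0,  # the wiki, assumed to be the most relevant site
}


def boosted_score(url, base_score, boosts=SITE_BOOST):
    """Multiply the engine's base score by the boost of the page's host (default 1.0)."""
    host = urlparse(url).hostname or ""
    return base_score * boosts.get(host, 1.0)


def rank(results, boosts=SITE_BOOST):
    """Sort (url, base_score) pairs by boosted score, best first."""
    return sorted(
        results,
        key=lambda item: boosted_score(item[0], item[1], boosts),
        reverse=True,
    )
```

With these weights, a wiki page with a base score of 0.4 ends up ahead of a blog comment with a base score of 0.6, which is the behaviour Google's customized search would not give us.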
The last very interesting feature is that it can index pretty much anything. It doesn't have to be only web pages. Anybody can write their own plugin that knows how to handle some specialized format. If I want to be able to search among the rpms on the Build Service, I can write a simple filter to make it possible. Then, when searching for MySQL, I wouldn't see only wiki pages dedicated to MySQL and related blog posts, but also the rpms of MySQL itself. Pretty interesting, isn't it? I'm not really sure whether we want this, but we can do it with this search engine.
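Such an rpm filter could look roughly like the sketch below: it turns a package into plain text that an indexer can read. I am assuming a filter is just a program that converts a file into text; how it would be registered with Datapark is not shown here. The `rpm -qp --queryformat` call and the query tags are standard rpm features, while the function names and output layout are my own.

```python
import subprocess

# Standard rpm query tags; "\n" is expanded by rpm itself.
QUERY_FORMAT = "%{NAME}\\n%{VERSION}\\n%{SUMMARY}\\n%{DESCRIPTION}"


def read_rpm_metadata(path):
    """Query a package file with the rpm command-line tool (requires rpm to be installed)."""
    output = subprocess.run(
        ["rpm", "-qp", "--queryformat", QUERY_FORMAT, path],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    # The description may span several lines, so split only three times.
    name, version, summary, description = output.split("\n", 3)
    return {
        "name": name,
        "version": version,
        "summary": summary,
        "description": description,
    }


def to_indexable_text(meta):
    """Render package metadata as plain text for the indexer (layout is arbitrary)."""
    return "{name} {version}\n{summary}\n\n{description}\n".format(**meta)
```

A filter script would then print `to_indexable_text(read_rpm_metadata(path))` for the file it is given, and the search engine would index that text like any other document.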
Conclusion
I think we should use Datapark Search Engine because it's Open Source, it has categories and tags, it can give extra points to sites we like and it's highly customizable. If I missed something interesting that we should evaluate, please let me know. There are many interesting projects out there and I tried only a few of them. Although I think I found what I was looking for, any comments and suggestions are welcome...
MySQL Version Updates
A few weeks ago I was at FOSDEM. It was a really amazing experience. I met many interesting people, learned quite a few things and returned full of enthusiasm. Open Source events are really great.
But the fun wasn't over even after FOSDEM. I spent a few more days in Brussels attending the MySQL packagers meeting organized by Sun/Oracle. We spent quite some time talking to each other. We learned what the MySQL people are doing and how. And they learned how we deal with MySQL and what is troubling us. And many good things will come from this.
The first, but certainly not the last, of them is about to appear now. One very interesting thing we learned at the meeting was the MySQL release policy. What openSUSE, Ubuntu and maybe some others do is that after the release date, version updates are generally not allowed. We only fix serious bugs and security-related issues. It takes quite some work. What we learned is that new releases in a stable branch of MySQL are in fact maintenance updates. If you update from 5.1.43 to 5.1.44, you won't get any new features. All you will get are bugfixes, and only bugfixes for serious or security-related issues. Does it sound familiar? Yes, it is the same thing we are doing! So I discussed it with our maintenance team, and we came to the conclusion that we want to give our users all the serious fixes, not only a few selected ones. And the best way to do that is to use the maintenance updates provided by the MySQL people themselves. I'm not saying that I don't have enough confidence to play with the MySQL sources, but I think that the MySQL people can do it better.
Yes, you are guessing right. What I'm trying to say is that we are going to update MySQL to the latest available version. This means 5.1.44 for openSUSE 11.2 and 5.0.90 for older openSUSE releases. We will start with 11.2, as the version gap is smaller there, and if everything proceeds smoothly, we will continue with 11.1 and 11.0. For 11.2 you can help by testing the update. Currently the 5.1.44 update for 11.2 is prepared in server:database:STABLE and I'm running some final tests. If you want, you can try it too (not recommended on production servers yet), and if you find any problems, please report them before it hits the official updates.
Remember, this is just the beginning. I've got some bigger plans regarding MySQL in 11.3.
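The rule described above, that an update within the same stable branch is a maintenance update, can be written down as a small check. This is just my reading of the policy as a sketch; it assumes plain three-part version strings such as 5.1.44, and the function names are made up.

```python
def parse_version(version):
    """Split a version like '5.1.44' into (major, minor, patch) integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_maintenance_update(installed, candidate):
    """True if candidate is a newer release in the same stable branch (same major.minor)."""
    old = parse_version(installed)
    new = parse_version(candidate)
    return old[:2] == new[:2] and new[2] > old[2]
```

So `is_maintenance_update("5.1.43", "5.1.44")` is true, while going from 5.0.90 to 5.1.44 is a branch change and is not covered by this policy. That is why 11.2 gets 5.1.44 and the older releases stay on the 5.0 branch.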