Bug#881692: command-not-found: I re-wrote command-not-found
Julian Andres Klode
jak at debian.org
Tue Nov 14 07:50:33 UTC 2017
(forwarding this to ubuntu-devel-discuss and Zygmunt)
On Mon, Nov 13, 2017 at 10:33:39PM -0800, Shawn Landden wrote:
> Package: command-not-found
> Severity: wishlist
>
> I re-wrote command-not-found to get rid of the Python dependency, and
> to reduce the database size, so as to reduce memory usage.
>
> https://github.com/shawnl/command-not-found
>
> I was preparing to upload it to mentors as command-not-found-ng
I also rewrote it years ago, in C, but keeping the same database format.
It was a lot faster. I don't understand the memory usage bit, though:
the database is memory mapped rather than read into memory, so memory
usage should be roughly constant no matter how large the database is.
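To make the memory-mapping point concrete, here is a minimal Python sketch. The tab-separated record format and the names `make_sample_db` and `lookup` are assumptions for illustration only; they are not taken from either implementation.

```python
import mmap
import os
import tempfile


def make_sample_db():
    """Write a tiny hypothetical database: one "command\tpackage" per line."""
    fd, path = tempfile.mkstemp(suffix=".db")
    with os.fdopen(fd, "wb") as f:
        f.write(b"firefox\tfirefox-esr\nthunderbird\tthunderbird\n")
    return path


def lookup(path, command):
    """Return the package recorded for `command`, or None if absent."""
    key = command.encode() + b"\t"
    with open(path, "rb") as f:
        # Map the file read-only. Pages are faulted in on access and live
        # in the page cache; the file is not copied into the process heap.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            pos = m.find(key)
            while pos != -1:
                # Accept the hit only if it sits at the start of a record.
                if pos == 0 or m[pos - 1:pos] == b"\n":
                    end = m.find(b"\n", pos)
                    if end == -1:
                        end = len(m)
                    return m[pos + len(key):end].decode()
                pos = m.find(key, pos + 1)
    return None
```

A linear scan like this still touches every page up to the match; a sorted or hashed on-disk format would touch only a few pages per lookup.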
Questions/Comments for your approach:
* Did you test your format on a slow HDD with caches dropped? It
must not be slower than the Python one (that one is way too slow
already).
- I did, and it seems to be faster (0.4 vs 0.68 seconds).
- I believe the database-based C rewrite was faster still, though.
* update-command-not-found should use apt-get indextargets
* You don't store components, hence you cannot tell people to enable
a component. That's a very important use case for Ubuntu, where
not all components are enabled by default, but the database is
shipped in the package.
You could just append /<component> to each package name I think,
and strip it away when displaying.
* You should use getopt_long() to parse command-line options, and
support -h, --help :)
* pts_lbsearch belongs in usr/lib/..., not usr/share/...
* You don't implement a closest matches function:
    $ command-not-found thunderbrd
    No command 'thunderbrd' found, did you mean:
     Command 'thunderbird' from package 'thunderbird' (main)
    thunderbrd: command not found
    $ ./command-not-found thunderbrd
    thunderbrd: command not found
This one is really important. People do make typos or misremember
command names, so the tool needs to be able to deal with that.
It should be easy to implement, although you might have to
search multiple times, once for each alternative. All you need is:
    def similar_words(word):
        """ return a set with spelling1 distance alternative spellings

            based on http://norvig.com/spell-correct.html"""
        alphabet = 'abcdefghijklmnopqrstuvwxyz-_'
        s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
        deletes = [a + b[1:] for a, b in s if b]
        transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b) > 1]
        replaces = [a + c + b[1:] for a, b in s for c in alphabet if b]
        inserts = [a + c + b for a, b in s for c in alphabet]
        return set(deletes + transposes + replaces + inserts)
Then search for each word it returns. You don't need to search for
those at all if you have a direct match.
* It needs to be translated - also very important.
* You need to use Conflicts against command-not-found rather than Breaks, AFAIUI.
* You should not depend on grep, sed, or coreutils; they are Essential.
* You do have to Depend on apt-file, as that configures apt to download
the Contents files
* You should not have identifiers starting with _ in the program, these
are reserved for the C implementation (like _cleanup_free_).
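For the component and closest-match points above, a rough Python sketch of how the two could fit together. The "package/component" value format follows the suggestion in this mail; the dict-based `database`, `split_component` and `suggest` are hypothetical stand-ins for whatever on-disk lookup the tool really uses.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz-_"


def similar_words(word):
    """Return the set of strings at edit distance 1 from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits for c in ALPHABET if b]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def split_component(entry):
    """Split "package/component"; the component part may be missing."""
    package, _, component = entry.partition("/")
    return package, component or None


def suggest(command, database):
    """Return (exact, hits), each hit being (command, package, component)."""
    if command in database:
        # Direct match: skip generating alternatives entirely.
        return True, [(command, *split_component(database[command]))]
    hits = []
    # One lookup per alternative spelling, as described above.
    for candidate in sorted(similar_words(command)):
        if candidate in database:
            hits.append((candidate, *split_component(database[candidate])))
    return False, hits


# Hypothetical sample data, only to exercise the functions.
database = {"thunderbird": "thunderbird/main", "htop": "htop/universe"}
exact, hits = suggest("thunderbrd", database)
for name, package, component in hits:
    print(f"Command '{name}' from package '{package}' ({component})")
```

Splitting on "/" is safe here because Debian package names cannot contain a slash.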
Yes, and these are basically the same reasons my C prototype is
not in the archive. Also, I did not put a lot of work into it, as
I was waiting for PackageKit to take that over, but that has not
happened yet.
I think it's a worthwhile approach, and I can see it replacing
command-not-found once those small issues have been fixed. Then you
could also avoid the -ng moniker, and just take over the main
package (if Zygmunt does not mind), which also avoids a month-long
NEW process :)
--
Debian Developer - deb.li/jak | jak-linux.org - free software dev
Ubuntu Core Developer de, en speaker