Bug#881692: command-not-found: I re-wrote command-not-found

Julian Andres Klode jak at debian.org
Tue Nov 14 07:50:33 UTC 2017

(forwarding this to ubuntu-devel-discuss and Zygmunt)

On Mon, Nov 13, 2017 at 10:33:39PM -0800, Shawn Landden wrote:
> Package: command-not-found
> Severity: wishlist
> I re-wrote command-not-found to get rid of the Python dependency, and
> to reduce the database size, so as to reduce memory usage.
> https://github.com/shawnl/command-not-found
> I was preparing to upload it to mentors as command-not-found-ng

I also rewrote it years ago, but using the same database format,
just in C. It was a lot faster. I don't understand the memory usage
bit: it should not matter how large the database is, since it is
memory-mapped rather than read into memory, so memory usage should
be roughly constant.
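
To illustrate the memory-mapping point, here is a minimal Python sketch
(not the code of either rewrite). It assumes a hypothetical plain-text
database of byte-wise sorted "command<TAB>package" lines; that layout and
the lookup() helper are made up for the example. The search only touches
the pages it actually probes, so the size of the file should have little
effect on the memory the process needs.

```python
import mmap


def lookup(db_path, command):
    """Binary-search a sorted 'command<TAB>package' file through mmap.

    Hypothetical format, for illustration only. Returns the package
    name, or None if the command is not in the database.
    """
    key = command.encode()
    with open(db_path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        lo, hi = 0, len(m)  # byte offsets; lo always sits on a line start
        while lo < hi:
            mid = (lo + hi) // 2
            # Find the start of the line that contains byte `mid`.
            nl = m.rfind(b"\n", lo, mid)
            start = lo if nl < 0 else nl + 1
            # Find the end of that line (or the end of the file).
            end = m.find(b"\n", start)
            if end < 0:
                end = len(m)
            name, _, package = m[start:end].partition(b"\t")
            if name == key:
                return package.decode()
            if name < key:
                lo = end + 1  # continue in the lines after this one
            else:
                hi = start    # continue in the lines before this one
    return None
```

Each probe copies a single line out of the mapping, so a lookup needs
O(log n) small reads instead of loading the whole file.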

Questions/Comments for your approach:

* Did you test your format on a slow HDD with caches dropped? It
  must not be slower than the Python one (that one is way too slow
  already). I did, and it seems to be faster (0.4 vs 0.68 seconds),
  though I believe the database-based C rewrite was faster still.
* update-command-not-found should use apt-get indextargets
* You don't store components, hence you cannot tell people to enable
  a component. That's a very important use case for Ubuntu, where
  not all components are enabled by default, but the database is
  shipped in the package.

  You could just append /<component> to each package name I think,
  and strip it away when displaying.
* You should use getopt_long() to parse command-line options, and
  support -h, --help :)
* pts_lbsearch belongs in usr/lib/..., not usr/share/...

* You don't implement a closest matches function:

	$ command-not-found thunderbrd
	No command 'thunderbrd' found, did you mean:
	 Command 'thunderbird' from package 'thunderbird' (main)
	thunderbrd: command not found
	$ ./command-not-found thunderbrd
	thunderbrd: command not found

   This one is really important. People do make typos or misremember
   command names, so the tool needs to be able to deal with that.

   Should be easy to implement though, although you might have to
   search multiple times - once for each alternative. All you need is

	def similar_words(word):
	    """Return the set of spellings at edit distance 1 from word,
	    based on http://norvig.com/spell-correct.html"""
	    alphabet = 'abcdefghijklmnopqrstuvwxyz-_'
	    # every way to split word into a prefix a and a suffix b
	    s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
	    deletes    = [a + b[1:] for a, b in s if b]
	    transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b)>1]
	    replaces   = [a + c + b[1:] for a, b in s for c in alphabet if b]
	    inserts    = [a + c + b     for a, b in s for c in alphabet]
	    return set(deletes + transposes + replaces + inserts)

   Then search for each word it returns. You don't need to search for
   those at all if you have a direct match.
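
Put together, the lookup could go roughly like this. It is only a
sketch: DATABASE is a made-up in-memory stand-in for the real database,
suggest() is a hypothetical helper name, and similar_words() is the
function quoted above, repeated so the example runs on its own.

```python
def similar_words(word):
    """Return the set of spellings at edit distance 1 from word."""
    alphabet = 'abcdefghijklmnopqrstuvwxyz-_'
    s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in s if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in s for c in alphabet if b]
    inserts = [a + c + b for a, b in s for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


# Hypothetical stand-in for the database: command -> (package, component)
DATABASE = {
    "thunderbird": ("thunderbird", "main"),
    "git": ("git", "main"),
}


def suggest(command, database=DATABASE):
    """Return (exact, matches) for a command name.

    exact is True on a direct hit; in that case the alternative
    spellings are never generated or searched.
    """
    if command in database:
        return True, [(command,) + database[command]]
    # No direct hit: one lookup per alternative spelling.
    matches = sorted(
        (name,) + database[name]
        for name in similar_words(command)
        if name in database
    )
    return False, matches
```

With this toy database, suggest("thunderbrd") finds "thunderbird" through
the insert of "i", while suggest("git") returns straight away.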

* It needs to be translated - also very important.

* You need to Conflict with command-not-found rather than Break it, AFAIUI.

* You should not depend on grep, sed, or coreutils; they are Essential.

* You do have to Depend on apt-file, as that configures apt to download
  the Contents files.

* You should not have identifiers starting with _ (like _cleanup_free_)
  in the program; these are reserved for the C implementation.
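
For the component point in the list above, a minimal sketch of appending
/<component> to each stored package name and stripping it again for
display. The helper names and the wording of the hint are invented for
the example. "/" cannot occur in a Debian package name, so it works as a
separator.

```python
def encode_entry(package, component):
    """Build the string stored in the database, e.g. 'thunderbird/main'."""
    return f"{package}/{component}"


def decode_entry(entry):
    """Split a stored entry back into (package, component).

    Entries without a '/' yield a component of None.
    """
    package, sep, component = entry.rpartition("/")
    if not sep:
        return entry, None
    return package, component


def format_hint(command, entry, enabled_components):
    """Render the suggestion, adding a note if the component is disabled."""
    package, component = decode_entry(entry)
    lines = [f"Command '{command}' from package '{package}'"]
    if component is not None:
        lines[0] += f" ({component})"
        if component not in enabled_components:
            # Illustrative wording, not the tool's actual message.
            lines.append(
                f"You will have to enable the component called '{component}'"
            )
    return "\n".join(lines)
```

The component costs a few extra bytes per entry and only matters at
display time, so the search itself does not need to know about it.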

Yes, and these are basically the same reasons my C prototype is
not in the archive. Also, I did not put a lot of work into it, as
I was waiting for PackageKit to take that over, but that has not
happened yet.

I think it's a worthwhile approach, and I can see it replacing
command-not-found once those tiny issues have been fixed. Then you
could also avoid the -ng moniker, and just take over the main
package (if Zygmunt does not mind), which also avoids a month-long
NEW process :)

Debian Developer - deb.li/jak | jak-linux.org - free software dev
Ubuntu Core Developer                              de, en speaker
