Open Thoughts

Swamped in R-CRAN updates

Posted by Soeren Sonnenburg on May 24, 2011

It seems like the regular updates of packages in R-CRAN are starting to hide the manually updated packages on mloss.org. We are therefore only updating R-CRAN packages once per week (instead of daily as we used to).

I hope this gets your packages increased visibility again.

Comments

Helmut Muelner (on May 24, 2011, 13:42:34)

It would be even better if the r-cran-robot worked correctly. It should only send notifications if a package really was updated, which can be recognized easily ("Published: yyyy-mm-dd").
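As a rough illustration of that idea, here is a minimal sketch that reads a "Published: yyyy-mm-dd" entry from a package page and reports an update only when the date is newer than the one seen before. The function names and the way the last seen date is stored are assumptions for illustration, not part of the mloss.org code.

```python
import re
from datetime import date

# Hypothetical helpers; not taken from the mloss.org source.
PUBLISHED_RE = re.compile(r"Published:\s*(\d{4})-(\d{2})-(\d{2})")


def published_date(page_text):
    """Return the date of the first 'Published: yyyy-mm-dd' entry, or None."""
    match = PUBLISHED_RE.search(page_text)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def really_updated(page_text, last_seen):
    """True only if the page shows a publication date newer than last_seen.

    last_seen is the date recorded at the previous crawl, or None if the
    package has never been seen (assumed bookkeeping, not the real schema).
    """
    current = published_date(page_text)
    if current is None:
        return False
    return last_seen is None or current > last_seen
```

A crawler built this way would skip the notification whenever the date on the page equals the stored one, however often the page is fetched.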

BTW: Your sign-up form said: Last Name: Last name can only contain letters, numbers and underscores

when I tried to enter my name (Mülner) with u-umlaut, which is a letter in most parts of the world.

Zeno Gantner (on May 24, 2011, 17:42:32)

Thank you for fixing this.

What would also be nice: Actually display what has been changed in the packages (not only that the information has been crawled by a bot).

Soeren Sonnenburg (on May 24, 2011, 21:04:45)

Helmut, I don't see that this fixes the issue: We are parsing all machine learning packages' description files, e.g.

http://cran.r-project.org/web/packages/rminer/DESCRIPTION

and while there is a Date/Publication entry in there, it is always even newer than the Packaged date or the Date.

But if you can come up with something clever - the script is in the mloss.org source code (available from mloss.org :) in

mloss/cran/update_cran.py

Regarding umlauts, see https://mloss.org/faq/ number 8.

Zeno, I don't see how we could extract that information - but would be happy to include it.

Soeren

Helmut Muelner (on May 25, 2011, 13:53:53)

The rminer DESCRIPTION contains:

  Packaged: 2011-04-25 14:10:56 UTC; root
  Repository: CRAN
  Date/Publication: 2011-04-25 17:47:31

The TWIX DESCRIPTION contains:

  Packaged: 2009-11-02 18:41:42 UTC; sergejpotapov
  Repository: CRAN
  Date/Publication: 2009-11-03 11:23:53

In both cases the Packaged and Date/Publication dates are ok and in the past.

Although I know only a little Python, I had a look at your update script and would suggest the following changes:

  • The time format string in line 51 should be "%Y-%m-%d %H:%M:%S UTC", or the split argument in line 160 should be changed from ';' to ' '
  • lines 154 and 160 should be interchanged, so the check for "Packaged" comes first.
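
The two suggestions above can be sketched roughly as follows. This is a standalone illustration, not the code in mloss/cran/update_cran.py: the function names are hypothetical, and it assumes the "Packaged" value has the form "date time UTC; user" and that "Date/Publication" carries no timezone suffix, as in the rminer and TWIX examples.

```python
from datetime import datetime

# Format strings assumed from the DESCRIPTION examples quoted above.
PACKAGED_FORMAT = "%Y-%m-%d %H:%M:%S UTC"
PUBLICATION_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_description(text):
    """Collect 'Key: value' pairs from DESCRIPTION text into a dict."""
    fields = {}
    for line in text.splitlines():
        if line[:1].isspace():
            continue  # skip indented continuation lines
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def release_date(fields):
    """Check 'Packaged' first, then fall back to 'Date/Publication'."""
    if "Packaged" in fields:
        # Drop the '; user' part, keeping 'YYYY-MM-DD HH:MM:SS UTC'.
        stamp = fields["Packaged"].split(";")[0].strip()
        return datetime.strptime(stamp, PACKAGED_FORMAT)
    if "Date/Publication" in fields:
        return datetime.strptime(fields["Date/Publication"], PUBLICATION_FORMAT)
    return None
```

For the rminer example this would yield the "Packaged" timestamp, 2011-04-25 14:10:56, rather than the later Date/Publication value.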

Concerning umlauts: You could at least allow Latin-1 characters.

Helmut Muelner (on May 30, 2011, 14:18:23)

BTW the RSS feed (http://mloss.org/software/rss/latest) works correctly, e.g. on May 26 it only showed r-cran-caret 4.89.

Helmut Muelner (on June 6, 2011, 11:39:42)

Correction: Now the RSS feed also is faulty.

Helmut Muelner (on June 6, 2011, 11:45:39)

Another BTW: The link to your homepage http://ida.first.fraunhofer.de/~sonne does not work.
