Closed Bug 98089 Opened 20 years ago Closed 19 years ago

license foo

Categories

(mozilla.org :: Miscellaneous, task)

task
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: scc, Assigned: scc)

Attachments

(15 files, 25 obsolete files)

The default goal of the reporting script is to produce a list of filenames
meeting the selected criteria (source directory, extension, license).  Then the
`replace' script takes that list, a new license, and some extra specifications
(in a format common to the reporting script) and actually modifies files. 
Splitting them into two pieces gives you the opportunity to generate your own
list, or verify/modify the one produced by the reporting script.  I haven't
finished the replace script.  For that matter, the reporting script is still in
rough shape and doesn't actually discriminate on extensions yet.  I should have
this stuff all fixed up and ready to go by Thursday.  Should this bug remain
mozilla.org confidential?  CC appropriate parties.
Status: NEW → ASSIGNED
OS: Linux → All
This bug should definitely remain mozilla.org-confidential for the moment. (Now
being on the other side of the confidentiality divide, I'm beginning to
appreciate all those Bugzilla security fixes.)

Gerv
I can't think of any good reasons why this should remain confidential. It's not
like the relicensing effort (which has been going on in public for over a year
now) is some great big secret.  
Gerv: what about this bug would be damaging if it were public?  I'm with Asa,
but there may be something I'm missing.
People are going to find out soon enough anyway; I just feel that lots of heat
and very little light will be generated, both in this bug and the newsgroups.
All of the questions in the FAQ will get asked before we are able to post it,
which will be tiresome. (And yes, I know originally FAQs were supposed to be
frequently asked.)

Also, I think it would be cool if it was a surprise ;-)

But I don't care overly much. If everyone else wants to open it, it's fine by me.

Current ETA for the first round of changes is Monday morning during tree
closure. I need to coordinate with Dawn and Scott.

Gerv
chmod +x lick ripl

...as they are both scripts you'll want to execute from the shell.  They should
be in the same directory as lutils.py.  That directory is the working directory
of the script, and paths listed for exclusion in the configuration file must be
relative to that working directory.

I still need to do some more work, particularly on getting actual license munging
right.  Gerv, these tools and the sample configuration file should be sufficient
for you to build a real configuration file describing the actions we want to
perform.  I still think my script work can be done before (Pacific) morning :-)
 That means a couple more attachments to come.
Attachment #48088 - Attachment is obsolete: true
Attachment #48832 - Attachment is obsolete: true
Attachment #48833 - Attachment description: a sample configuration file → a sample configuration file (appropriate for running lick in --report mode, since it lacks an `include license' section)
Attachment #48834 - Attachment is obsolete: true
lick now only reports files with licenses specified in the config file.  Failing
to specify any licenses allows all licenses (which is what you probably want for
--report mode).  Run lick in --list mode to produce a list of relative paths to
qualifying files suitable for input into ripl.
Gerv --- don't get worked up about the delimiter info to be specified in the
configuration file ... it's only needed in cases where we are inserting a
license into a file that previously did not textually contain a license.  ripl
uses the existing license to determine the appropriate delimiters.  For the
initial set of files to work on (only ones that explicitly contain the npl),
this isn't necessary, so we can totally omit that section from the configuration
file.
Attachment #48835 - Attachment is obsolete: true
The one vital thing missing is successfully getting date and contributor, et al,
 info from the previous license and interpolating it into the new license. 
Also, the script currently only knows to insert a license at the point where it
finds an existing license, hence, it currently can't be used to add a license to
a file that doesn't already have one (as I alluded to above).  I'll add a mode
for this soon, but my main concern is the interpolation so we can get the npl
job done.

You know, this first run might be a special case, though, where interpolation
isn't necessary ... since we're keeping the npl, and just replacing anything
else after that with the lgpl ... we could avoid interpolation for the moment. 
To make ripl exploit this tactic is but the work of a moment or two.

What do you think?  Is that the way to go (for now)?
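
A minimal sketch of the interpolation step discussed above: pulling the copyright years and contributor names out of an old header so they can be dropped into a new one. It assumes the old header carries a `Copyright (C) <years>` line and a `Contributor(s):` list. The regular expressions, the `__YEARS__` and `__CONTRIBUTORS__` placeholders, and the function names are all hypothetical; the real patterns live in lutils.py and may differ.

```python
import re

# Hypothetical pattern for the years on the copyright line.
YEARS_RE = re.compile(r"Copyright \(C\) (\d{4}(?:-\d{4})?)")
CONTRIB_LABEL = "Contributor(s):"


def strip_comment(line):
    """Remove leading comment decoration such as ' * ', '# ' or '// '."""
    return re.sub(r"^\s*(?:/\*+|\*+/?|#+|//+|;+)?\s?", "", line).strip()


def extract_fields(header_lines):
    """Return (years, contributors) found in the lines of an old header."""
    years = None
    contributors = []
    in_contributors = False
    for raw in header_lines:
        text = strip_comment(raw)
        match = YEARS_RE.search(text)
        if match and years is None:
            years = match.group(1)
        if text.startswith(CONTRIB_LABEL):
            in_contributors = True
            rest = text[len(CONTRIB_LABEL):].strip()
            if rest:
                contributors.append(rest)
        elif in_contributors:
            if text:
                contributors.append(text)
            else:
                in_contributors = False  # a blank line ends the list
    return years, contributors


def fill_template(template, years, contributors):
    """Substitute the extracted values into the (hypothetical) blanks."""
    names = "\n".join("  " + name for name in contributors)
    return (template.replace("__YEARS__", years)
                    .replace("__CONTRIBUTORS__", names))
```

Keeping extraction separate from template filling means the extraction can be checked against a handful of real headers before any file is modified.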
We should actually add appropriate licenses to these scripts, and check them
into the webtools directory near the old whack_license script.
scc - I assume that, given you were up until 6am, you are now asleep. Therefore,
I won't call you to ask you about all of this. :-) Feel free to call me and
prove me wrong. 

Having read your comments, I don't think we are quite ready to roll. I'm leaving
for the week on Wednesday lunchtime so it would be really good if we could make
this happen tomorrow at 8am PST (just after tree closure.) Scott - if you want
to be the one to check this lot in, you'll need to be awake at that time. ;-) As
I see it, we do need the ability to license unlicensed files, and to strip and
interpolate the header information.

I think it's important to kill old licenses and replace them with boilerplate of
a known format; trying to LGPLise them on the fly is, I think, likely to
increase the risk of non-uniformity. It also makes it much harder, if not
impossible, to put the delimiting markers in.

I'll attach what I think is the correct config file. I can't quite work out by
reading the code whether my version of the config file will produce exactly the
output (in terms of formatting) in the sample boilerplate I sent you. This is
important, because I've already got some changed files in my local tree, and we
really need to have all the boilerplate consistent.

Gerv
For each file in the repository, find out when it was first checked
in and by whom. Print out the checkin comment so we can get some clue
as to whether it was checked in for someone else.

Select dirs.dir, files.file, people.who, ci_when, descs.description from
checkins, dirs, files, people, descs where checkins.dirid=dirs.id and
checkins.fileid=files.id and checkins.whoid=people.id and
checkins.revision="1.1" and checkins.descid=descs.id order by ci_when
into outfile "/tmp/initialcheckersin2";

-rw-r--r--   1 root     other    8343916 Sep  7 17:43 initialcheckersin
this is a big file so beware of loading the whole thing with mozilla.
try using wget instead.

http://bonsai.mozilla.org/contributorlogs/initialcheckersin

- I edited the text file to combine the path and filename columns
to make it more readable

- files at the beginning of the list were all checked in by netscape people
before we started using complete email addresses so when searching for
netscape-contributed files, searching for "netscape.com" in the cvs acct name is
not enough.  It seems the change occurred 1998-07-15 18:16:00

- Don't confuse 'first checked in by a netscape employee' with 'originally 
contributed by netscape.com'.

- You're looking for unlicensed files originally contributed by Netscape. First,
you need to look through this file to find the ones that were originally checked
in by netscape people. Then you need to look at the file, the package it belongs
to, and the checkin comment to see whether the netscape person only checked it 
in, and it was actually contributed by someone else.

- I was originally concerned that checkins by people who checked
in before, during, and after their netscape employment might get blurred
together and we wouldn't be able to tell which checkins happened during
their netscape employment and which ones belonged to them. However, the data
seems to suggest that changing a user name in despot creates a new account
instead of changing the name. If this is true, this would be good news for us,
but it's worth paying attention to the checkin dates for people who have 
checked in both as netscape employees and not.
Dawn: are you asking me to go through that file by hand, or are you planning to
run some scripts on it?

It seems that we only need to make a judgement on one file for each checkin
comment, and that judgement applies to all other files checked in at the same
time. For example, if I decide that:
mozilla/build/build_number
ltabb
1998-03-27 18:13:00	Free the lizard
was contributed by Netscape, then all other files checked in at the same time
with Free the lizard as the checkin comment are also Netscape-contributed.

Can we leverage this to do some sort of automation?

Gerv
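
One way the automation asked about above might look, as a rough sketch: group the checkin records by committer, timestamp, and comment, make one judgement per group, and propagate it to every file in the group. The record layout (path, who, when, comment) mirrors the example above. The function names and the `mozilla/hypothetical/...` paths in the usage are made up for illustration, and matching on an exact timestamp is an assumption: a real tool might need a small time window.

```python
from collections import defaultdict


def group_by_checkin(records):
    """Group (path, who, when, comment) records that share who/when/comment."""
    groups = defaultdict(list)
    for path, who, when, comment in records:
        groups[(who, when, comment)].append(path)
    return dict(groups)


def apply_judgements(groups, judgements):
    """Copy the single verdict made for a checkin to every file in it."""
    verdicts = {}
    for key, paths in groups.items():
        verdict = judgements.get(key, "undecided")
        for path in paths:
            verdicts[path] = verdict
    return verdicts


if __name__ == "__main__":
    records = [
        ("mozilla/build/build_number", "ltabb",
         "1998-03-27 18:13:00", "Free the lizard"),
        # Hypothetical second file from the same checkin.
        ("mozilla/hypothetical/other_file", "ltabb",
         "1998-03-27 18:13:00", "Free the lizard"),
    ]
    key = ("ltabb", "1998-03-27 18:13:00", "Free the lizard")
    groups = group_by_checkin(records)
    for path, verdict in apply_judgements(groups, {key: "netscape"}).items():
        print(verdict, path)
```

With this shape a reviewer only has to fill in the `judgements` mapping, one entry per distinct checkin rather than one per file.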
Bugs in/problems with lick:
Line 167: "crx:l" should be "hrc:l".

lutils.py currently can't cope with blank lines in the .conf file - it bombs out
on line 78. My Python-fu is not strong enough to fix this.

You can't use "." as the directory to work in, because the path exclusion system
isn't smart enough to recognise ./nsprpub and nsprpub as the same directory. So
you have to lick mozilla :-)

The script currently concats the current path and the filename incorrectly when
thinking about excluding paths. This can be fixed on line 25 of lutils.py by
putting a "/" in the middle of the concatenation.

I modified the original script to output numbers of files instead of
percentages. I feel this is far more useful. You may want to do the same with lick.
Gerv
Attachment #48836 - Attachment is obsolete: true
added logic to safely ignore blank lines, and corrected the path exclusion bug
by using |os.path.join|, which uses the platform appropriate path component
separators (these scripts are cross-platform, after all)
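
For illustration, a sketch of how a path exclusion check can combine `os.path.join` with `os.path.normpath` so that `./nsprpub`, `nsprpub`, and `nsprpub/` all compare equal. This is not the code in lutils.py; the function name and arguments are hypothetical, and it assumes excluded paths are given relative to the same working directory as the paths being tested.

```python
import os.path


def is_excluded(directory, filename, excluded_paths):
    """Return True if directory/filename lies at or below an excluded path."""
    # join() inserts the platform's separator; normpath() collapses './',
    # doubled separators, and any trailing separator.
    candidate = os.path.normpath(os.path.join(directory, filename))
    for excluded in excluded_paths:
        excluded = os.path.normpath(excluded)
        # Compare whole path components, so 'security' does not
        # accidentally exclude 'securityfoo'.
        if candidate == excluded or candidate.startswith(excluded + os.sep):
            return True
    return False
```

Normalizing both sides before comparing is what removes the dependence on whether the user typed a leading `./` or a trailing `/`.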

Yeah, making lick spit out absolute numbers is better, that's what I had been
doing up until brendan asked what the percentages were ... it wasn't meant to be
that way permanently, just long enough to answer his question.  I imagine people
will hack on report mode to make it answer all sorts of questions (as you
yourself have done :-)  I'll switch it back to counts anon.

I'm glad you're spotting bugs ... whatever you find, I'll jump on.  Note also,
by the way, that I have not yet enabled nobackup mode ... the actual unlink is
commented out.  This is for your own protection.
Yes, someone needs to go through by hand and evaluate each file. Creative
use of a text editor should make it easy to handle groups of similar files.

Here's how I might do it.

- When the list of questionable files is created, grep for these file names
from the master list of all checkins to create a list of questionable checkins.

- Make a copy of the questionable checkin list and remove checkins for files
which were indeed created by people who were employed by Netscape. In the
end, the file contains only checkins contributed by others.

- Delete the files from the 'contributed by others' list from a copy of  the
'questionable files' list to create a list of unlicensed files checked in 
by netscape.
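
The three steps above amount to a filter followed by a set difference. A small sketch, assuming checkin records are (path, who, when, comment) tuples and that the "contributed by others" list is produced by hand editing; the function names are hypothetical.

```python
def questionable_checkins(master_checkins, questionable_files):
    """Step 1: keep only the checkins that touch a questionable file."""
    wanted = set(questionable_files)
    return [record for record in master_checkins if record[0] in wanted]


def netscape_unlicensed(questionable_files, contributed_by_others):
    """Step 3: drop the files that turned out to be contributed by others.

    Step 2 (deciding which checkins were really contributed by others)
    is the manual part and is represented here only by its result.
    """
    others = set(contributed_by_others)
    return [path for path in questionable_files if path not in others]
```

Returning lists in the original order keeps the result easy to diff against the input list after a round of manual edits.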
When you discover a set of related files, where the dates will be the same, you
could then ripl the list of them, providing defaults for the dates and original
owner.  The alternative is something like gerv suggested on IRC, where you
annotate the list with dates.  That's more work for both of us, I _think_.
Maybe there is some way to interleave default parameters in the file list, where
large hunks will share values....
Some background to Scott's comment: Dawn's scripts need to produce, for each
unlicensed file we decide to whack the license on, the start and end copyright
dates so we can fill in the boilerplate.

Actually, who are we kidding? It's not as if it makes any difference, is it?
Can't we just do 1998-2001 in every file, and be done with it?

Gerv
OK, after discussion between gerv, dawn and brendan, it's been decided to just
whack NPL files (and not NS-checked-in unlicensed files) tomorrow, and then go
public. We can then get more feedback on possible code bloat effects. This
should mean that Scott can do all of the required whackage without needing help
from anyone else.

Gerv
So this means all variations that have the npl in any form will be targeted? 
Across all of a default pull, except as excluded in your edited config file
attached here?  I think I'll modify ripl to also spit out a file that is the CVS
checkin command that matches the files it modified.
Yes :-)
Are you good to go? I'm heading into work now and if you aren't on IRC by then,
I'll give you a call.

Gerv
To be absolutely clear, the first stage is as follows:

Scott is to add the LGPL to the NPL in all files under the NPL or NPL/GPL in a
default pull of the Mozilla tree, excluding the following directories:

directory/c-sdk, nsprpub, security, expat

I am dealing with expat.

Gerv
Attachment #49084 - Attachment description: lutils.py --- also fixes options problems mentioned by gerv, and refactors license strings and patterns from lick → lutils.py --- refactors license strings and patterns from lick
Attachment #48838 - Attachment is obsolete: true
Attachment #48907 - Attachment is obsolete: true
cc'ing jag so he's in touch with what I've been working on
Attached file lick --- added debugging option (obsolete)
When naming directories for lick to search, always include the terminating '/',
and do the same for the excluded paths in your config file.  This ensures that
excluded paths will still be excluded even if you name them on the command line.
Attachment #49087 - Attachment description: fixed another path exclusion bug, added some debugging → lutils.py --- fixed another path exclusion bug, added some debugging
Attachment #49088 - Attachment description: added debugging option → lick --- added debugging option
Attachment #49084 - Attachment is obsolete: true
Attachment #49085 - Attachment is obsolete: true
localhost% ./lick -arc lick.config mozilla/
mozilla/:
 10708  38.80% unknown
  3188  11.55% mpl
 11456  41.51% npl
     3   0.01% mpl/npl
    19   0.07% gpl
  2143   7.76% mpl/gpl
     9   0.03% npl/gpl
     1   0.00% mpl/npl/gpl
    68   0.25% lgpl
     3   0.01% mpl/lgpl
     3   0.01% gpl/lgpl
 27601 total files examined

...using the config file supplied by Gerv, above... which means I'll be fixing
the following files

 11456  41.51% npl
     3   0.01% mpl/npl
     9   0.03% npl/gpl
     1   0.00% mpl/npl/gpl

The 13 non npl-only files may be due to overzealousness on the part of lick's
license determination policy.  I'll do some name spitting runs and attach the
files here.
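
For readers who want to reproduce the report layout shown above, here is a sketch that prints a count and a percentage per license combination followed by the total. It is not lick's actual code; `format_report` and its input mapping are hypothetical, and only the column layout is taken from the output above.

```python
def format_report(counts):
    """Format {license_combination: file_count} like the report above."""
    total = sum(counts.values())
    lines = []
    for combination, count in counts.items():
        percent = 100.0 * count / total if total else 0.0
        lines.append("%6d %6.2f%% %s" % (count, percent, combination))
    lines.append("%6d total files examined" % total)
    return lines


if __name__ == "__main__":
    for line in format_report({"npl": 9876, "mpl/npl": 3}):
        print(line)
```

Showing the absolute count next to the percentage makes small groups visible: a combination covering three files is easy to miss when it rounds to 0.01%.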
I can see from the output that excluded paths still aren't doing the right thing...

localhost% ./lick -lc find.config mozilla/
mozilla/LICENSE
mozilla/nsprpub/pr/src/md/mac/macio.c
mozilla/security/manager/makefile.win
mozilla/security/manager/pki/makefile.win
mozilla/security/manager/pki/public/Makefile.in
mozilla/security/manager/pki/public/makefile.win
mozilla/security/manager/pki/resources/makefile.win
mozilla/security/manager/ssl/makefile.win
mozilla/security/manager/ssl/public/Makefile.in
mozilla/security/manager/ssl/public/makefile.win
mozilla/security/manager/ssl/resources/makefile.win
mozilla/extensions/xmlterm/doc/MPL

...this is partial output from a run looking only for the license types
responsible for the 13 `odd' files discovered in previous comment. 
mozilla/security/ should be excluded, but you can see it hasn't been.  Back to
debugging...
Attached file lutils.py --- more exclusion fixes (obsolete)
Attachment #49087 - Attachment is obsolete: true
ok ... ignore that stuff about appending '/'.  Make sure you _don't_ append a
slash on the exclude paths.
Attachment #49103 - Attachment description: ./lick -lc lick.config mozilla → ./lick -lc lick.config mozilla # a list of the files eligible for re-licensing
Attachment #49103 - Attachment is obsolete: true
Oops.  Gerv didn't update the config file to match the latest decision of which
directories to exclude.  Previous list is no good.
localhost% ./lick -rc lick.config mozilla 
mozilla:
  9876 100.00% npl
  9876 total files examined

...so all of the files found (names listed in the attachment above) seem to be
npl only.
Expanding the set of licenses to every combination that includes the npl:

  # include licenses
  npl       
  mpl/npl
  npl/gpl
  mpl/npl/gpl 
  npl/lgpl
  mpl/npl/lgpl
  npl/gpl/lgpl 
  mpl/npl/gpl/lgpl


localhost% ./lick -rc lick.config mozilla
mozilla:
  9876  99.97% npl
     3   0.03% mpl/npl
  9879 total files examined

changing lick.config to include only mpl/npl...

localhost% ./lick -lc lick.config mozilla 
mozilla/LICENSE
mozilla/extensions/xmlterm/doc/MPL
mozilla/xpinstall/wizard/unix/src2/MPL-1.1.txt

...so it looks like all the files that weren't excluded by path or by dir are
straight npl.  Gerv: how does this list look to you?
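
A sketch of reading a config file shaped like the `# include licenses` fragment above while safely skipping blank lines. The real format is defined by lutils.py and is not fully shown in this bug, so treating every `#` line as a section header is an assumption, and `parse_config` is a hypothetical name.

```python
def parse_config(lines):
    """Return {section_name: [entries]} for a '# section' style config.

    Assumption: a line starting with '#' opens a new section and every
    other non-blank line is an entry in the current section.
    """
    sections = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines are ignored rather than treated as errors
        if line.startswith("#"):
            current = line.lstrip("#").strip()
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections
```

Stripping each line before testing it also absorbs the trailing spaces that appear after some entries in the fragment above.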
localhost% ./lick -arc lick.config mozilla
mozilla:
  8941  40.80% unknown
  2818  12.86% mpl
  9489  43.30% npl
     3   0.01% mpl/npl
    16   0.07% gpl
   188   0.86% mpl/gpl
   387   1.77% npl/gpl
    67   0.31% lgpl
     2   0.01% mpl/lgpl
     3   0.01% gpl/lgpl
 21914 total files examined

localhost% ./lick -rc lick.config mozilla
mozilla:
  9489  96.05% npl
     3   0.03% mpl/npl
   387   3.92% npl/gpl
  9879 total files examined

...so the distribution was wrong.  The three mpl/npl files are the same as
before.  So the list generated, I assume, will be the same, but I'll compare,
just in case.
the lists match
Attachment #49102 - Attachment is obsolete: true
the template needs editing to use only one non-comment-delimited form, and to
contain the `blanks' ripl understands how to fill in
Attachment #49128 - Attachment is obsolete: true
One tiny nit-pick: My boilerplates have the "All" of "All Rights Reserved" on
the following line. I thought this was neater at the time.

Gerv
Attachment #49130 - Attachment is obsolete: true
Attachment #48843 - Attachment is obsolete: true
well, I'm typing into this bug using a version of the app I built in a tree
where I re-licensed 6800 files.
Surprise! :-)

(making not mozilla.org-confidential)

Gerv
Group: mozillaorgconfidential?
yay for "open" source projects. </sarcasm>
Hixie: we've been working on this for a year and a half. How exactly have we
been doing it behind closed doors? I asked for the bug to be kept closed merely
because otherwise it would get filled with rubbish; but we did have Yet Another
Licensing Discussion in the newsgroups, feedback from which was incorporated
into these changes.

Gerv
Blocks: 82339
be in the directory containing the mozilla directory, have the latest versions
of lick, ripl, and lutils.py in this directory as well.

  ./lick -lc c-like.config some_directory/ > ripl.all-input
  head -n 200 ripl.all-input > ripl.input
  ./ripl < ripl.input
  # look at the diffs to see if anything is worth not checking in for
  cd mozilla
  cvs commit -m "bug #98089: ripped new license" some_directory/
  # repeat from the 'head' command till you've checked in everything
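
The procedure above works through the file list 200 entries at a time. Note that `head -n 200` returns the same first 200 lines on every pass unless the processed entries are removed from ripl.all-input between rounds, which presumably is the intent. A sketch of an alternative that walks the list in consecutive batches, with a hypothetical helper name:

```python
from itertools import islice


def batches(paths, size=200):
    """Yield consecutive lists of at most `size` paths, without repeats."""
    iterator = iter(paths)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Write each batch to its own numbered input file for ripl.
    with open("ripl.all-input") as handle:
        all_paths = [line.rstrip("\n") for line in handle if line.strip()]
    for number, batch in enumerate(batches(all_paths), start=1):
        with open("ripl.input.%03d" % number, "w") as out:
            out.write("\n".join(batch) + "\n")
```

Each numbered file can then be fed to ripl and checked in separately, which keeps the diffs to review per round at a manageable size.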
Are you sure you want sequences of '-' characters in xml comments?

http://lxr.mozilla.org/seamonkey/source/xpfe/components/autocomplete/resources/locale/en-US/contents.rdf#2

<!-- ----- BEGIN LICENSE BLOCK -----

Note the "must not" in http://www.w3.org/TR/REC-xml#sec-comments :

> 2.5 Comments
> 
> [...] For compatibility, the string "--" (double-hyphen) must not occur within
> comments. [...]
> 
> [15] Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'
If I read the rule correctly a '-' can only be followed or preceded by a Char
(which I guess doesn't contain '-').

So something like

<!-- - - - - - BEGIN LICENSE BLOCK - - - - - -->

or

<!-- ===== BEGIN LICENSE BLOCK ===== -->

would work.
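
The reading of production [15] above can be checked mechanically: the comment body is a run of units that are either a non-hyphen character or a hyphen followed by a non-hyphen character, so the body may neither contain "--" nor end in "-". A small sketch of that check; the function name is hypothetical, and the restrictions on which characters count as Char are ignored here.

```python
def is_valid_xml_comment(comment):
    """Check a complete '<!-- ... -->' string against XML 1.0 production [15]."""
    if len(comment) < 7:
        return False  # too short to hold both '<!--' and '-->'
    if not (comment.startswith("<!--") and comment.endswith("-->")):
        return False
    body = comment[4:-3]
    # No double hyphen inside, and no hyphen directly before the closing '-->'.
    return "--" not in body and not body.endswith("-")


if __name__ == "__main__":
    samples = [
        "<!-- ----- BEGIN LICENSE BLOCK ----- -->",
        "<!-- - - - - - BEGIN LICENSE BLOCK - - - - - -->",
        "<!-- ===== BEGIN LICENSE BLOCK ===== -->",
    ]
    for sample in samples:
        print(is_valid_xml_comment(sample), sample)
```

A check like this could be run over the delimiter strings chosen for XML-like files before any relicensing pass touches them.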
The second alternative with '=' characters is bad, too, because some people
search for '===' as an indicator of cvs conflicts. See e.g. bug 95759.
* or _ maybe?

... I use <<< === >>> while searching for merge conflicts (did that a few hours 
ago), so any of those would be *very* bad.
Speaking of license whacking, please say that your script will whack:
http://lxr.mozilla.org/seamonkey/search?string=oqr [many hits]
oh and
http://lxr.mozilla.org/seamonkey/search?string=Contributers [1 hit -- stamp out 
and reduce the risk of error inducing singletons]

And does anyone here see a need to retain 'express' instead of 'expressed' 
(which has better parallelism and is the dominant form according to google)?
Timeless, again you show your remarkable ability to write a comment which, at
first reading, makes absolutely no sense whatsoever.

The point about hyphens is well made - the original idea was to copy the strings
from PGP. I think it would be much easier if we just went for

BEGIN LICENSE BLOCK

unadorned. Does anyone see a problem with that?

Of course, we could use -*- ;-)

Gerv
> Of course, we could use -*- ;-)

As far as I know, -*- ... -*- are used for emacs modelines already.

What about "+"? In my view, using either "+" or "*" is nicer than the unadorned
version:
	+++++ BEGIN LICENSE BLOCK +++++
	
	***** BEGIN LICENSE BLOCK *****
 
But then, "nice" may not be the most important criterion here. The problem with
underscores (_____ BEGIN LICENSE BLOCK _____) is that they are already used to
mark places where something needs to be filled in.
If we can't copy PGP, I see no reason to bloat every file an extra 24 bytes
unnecessarily. Beauty is not a criterion here. :-)

BTW, I was joking about -*- :-)

Gerv
> BTW, I was joking about -*- :-)

Yes, I saw that. :-) My point was that even some jokes are not a good choice...
Attachment #49088 - Attachment is obsolete: true
Attachment #49120 - Attachment is obsolete: true
Attachment #49137 - Attachment is obsolete: true
Attachment #49889 - Attachment is obsolete: true
Attachment #49890 - Attachment is obsolete: true
Attachment #49895 - Attachment is obsolete: true
Attachment #50242 - Attachment is obsolete: true
Attachment #50246 - Attachment is obsolete: true
Attachment #50946 - Attachment is obsolete: true
Attachment #57939 - Attachment is obsolete: true
Gerv,

The attachment above is a tar file.  Save it locally (as, say, ripl.tar) and
then extract the contents with

  tar -xf ripl.tar

Make sure you set execute permissions on lick and ripl with

  chmod +x lick ripl

You'll need to generate the list of appropriately licensed xml-like files for
ripl to work on, by modifying your previous config file and running lick (as in
previous runs).  Then submit this list to ripl for relicensing as usual.

I'm still testing, you should be as well.
this re-license script bug seems to be of no further utility
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
rsv
Status: RESOLVED → VERIFIED
Would you consider relicensing some of the old classic code similarly?

There are a lot of useful obsolete files in the "classic" tree.  Specifically,
the old xfe stuff.  Maybe not to Mozilla, but to others.

Is there any way these files could have this extra GPL/LGPL compatibility stuff
added and somehow re-released?  This would be a win for some GPL/LGPL projects
that are still stuck using obsolete UI toolkits.
tringali: can you give some more concrete examples?

Gerv
Hi Gerv,

I (and others) maintain a GPL'd program (http://www.nedit.org).  A
bunch of people have contributed changes to it, to which they own the
copyright under the GPL.  

NEdit has always been a Motif editor.  It would take a ground-up rewrite to host
it to a new toolkit.  One of our aims is that anyone on a commercial Unix system
can compile it without having to go get any other libraries or have root
privileges.  Since most commercial Unixes ship with Motif already, the user
doesn't have to do anything extra to compile it.  This makes it very easy to
install and use.  Changing toolkits is not an option, as it would kill our
current niche.

But Motif is dead.  To improve NEdit, we would like to consider linking against
some NPL code.  Specifically, the XmL library, but in general, all the Motif
widgets look useful and might be so in the future.  Being a long time
Netscape/Unix user, the old xfe code has a lot of good Motif stuff that is well
tested and works.  If all of that code would be made GPL-compatible by
relicensing it, then we could link against it.

If the XmL code was not relicensed, we do have another option.  According to the
FSF, we would have to amend NEdit's license to allow end users to link the GPL
modules with the NPL modules.  Of course, it would be illogical and stupid to
make a source distribution that the users were not allowed to link together,
but licensing sometimes is that way.

For this, we would have to get the explicit permission of every copyright
holder who has ever contributed to nedit.  Not an impossible task,
but it does have a problem: a single person could block the entire process by
saying no, or (worse) not responding.  Perhaps for the "good of the world" this
old obsolete code could be opened up for the GPL and LGPL use?

After looking at the XmL source files, it does not appear that they have
been relicensed (as in your relicensing FAQ).  We are unsure if:

1) The files were intentionally not changed because they cannot be so.
2) The files were relicensed, but nobody bothered to update the old files.
3) The files were not relicensed, but Mozilla is amenable to doing so if there
is some need.

...etc.  So we figured we'd ask!
Thanks.