X-Authentication-Warning: delorie.com: mail set sender to geda-user-bounces using -f X-Recipient: geda-user AT delorie DOT com X-CMAE-Analysis: v=2.4 cv=ev4acqlX c=1 sm=1 tr=0 ts=60204c9f a=+cj0cO56Fp8x7EdhTra87A==:117 a=8ZHd1Vedsh/TjeYqJD4SKQ==:17 a=9+rZDBEiDlHhcck0kWbJtElFXBc=:19 a=dLZJa+xiwSxG16/P+YVxDGlgEgI=:19 a=IkcTkHD0fZMA:10 a=qa6Q16uM49sA:10 a=a1KZgU7cAAAA:8 a=Mj1Xp5F7AAAA:8 a=Cq1Uj3AEd3N1culE-xcA:9 a=QEXdDO2ut3YA:10 a=ng0hpkU2jXKPaRTLMVYJ:22 a=OCttjWrK5_uSHO_3Hkg-:22 X-SECURESERVER-ACCT: glimrick AT epilitimus DOT com X-Original-DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=epilitimus.com; s=default; h=Content-Transfer-Encoding:Content-Type: In-Reply-To:MIME-Version:Date:Message-ID:From:References:To:Subject:Sender: Reply-To:Cc:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=iFaBu3bu+yGMj8xxtSONLXtOBsKgnGw7bZLHL9DuufM=; b=eXkQl56SYzBQa3CicMCmbGe1ul /c+QUp95qL85GTGtBPDPPuYKrTyXyKAeZvtl+V40DhfTK9XSMM/1urs+N8st68WFoNwFqkd9qH/UB hZI3TSWC3tHnVEcDtWqHjtnVaCubKb+3HsRk7RdCRH8qo6p/B0hIQ6HfG9G2sqjU60xvw1zoIejBz iBpSDm3PFzVJxukM4ZMYVwoDkgBjM+v2q9Iw1frkZs0WI09h7e0QhtN6V/+qtHzlp5w9N1140R2qF JAMysNhUDqYwGMkaLRaSTXjpkgQWNG4YhpwAhgHkQrhXVKFla7wWlbvate7iQT2KRb6pGbSPa7I1u sIP2IwrQ==; Subject: Re: [geda-user] recursive subsheet issue To: geda-user AT delorie DOT com References: <0cc19ac4-217d-0897-52db-8b6fdcbbd975 AT epilitimus DOT com> From: "Glenn (glimrick AT epilitimus DOT com) [via geda-user AT delorie DOT com]" Message-ID: <68e49810-304f-3252-3201-b7a68ad16ff8@epilitimus.com> Date: Sun, 7 Feb 2021 12:24:53 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0 SeaMonkey/2.53.3 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - a2plcpnl0121.prod.iad2.secureserver.net X-AntiAbuse: Original Domain - delorie.com X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - epilitimus.com X-Get-Message-Sender-Via: a2plcpnl0121.prod.iad2.secureserver.net: authenticated_id: glimrick AT epilitimus DOT com X-Authenticated-Sender: a2plcpnl0121.prod.iad2.secureserver.net: glimrick AT epilitimus DOT com X-Source: X-Source-Args: X-Source-Dir: X-CMAE-Envelope: MS4xfBvqh8o98LF8gJMmW6p9ZQdny3HHlm0JbT0WsrqO2wx44s4RSCULd4wQoT4vWBawbeetFGbkdGEKQybNHX3zpLPGIlyGiKvPzHup1Q5xARyPpNiBy3J4 v0/lg3q+xIMxhm++BtooMKEZKbahQbX5uLeHnvnlzfFfz6h4AIkK1pt5MOoYNtH+9S1kS+QFTPVw07LgNOijpAeE9yiw8vMgsyd70Sup2+HJtK36BqndrkwW Reply-To: geda-user AT delorie DOT com Errors-To: nobody AT delorie DOT com X-Mailing-List: geda-user AT delorie DOT com X-Unsubscribes-To: listserv AT delorie DOT com Precedence: bulk Roland Lutz wrote: > > On Sat, 6 Feb 2021, Glenn (glimrick AT epilitimus DOT com) [via > geda-user AT delorie DOT com] wrote: >> Assuming I remembered correctly how to do it you should now find 3 new >> commits on the stable-1.10 branch in my repo. > > `stable-1.10' is only for fixes which should go into the next bugfix > release; regular development usually happens on `master'. > I put it on stable-1.10 because until you merge 1.10.2 back into master I can only build stable-1.10.2. As far as I know the patch should apply and build against master just fine. But I have no way to test it. > >>> commit 32d8fb59f919c5a26b5dae90a439f45c2ed7f86f >>> Author: Glenn Pavlovic >>> Date:   Sat Feb 6 21:24:53 2021 -0800 >>> >>>     Block inifinite subsheet recursion > > A few things which I noticed when looking at the patch: > > 1) The recursion check compares the os.path.realpath of the schematic, > which is not necessarily a good idea.  As a rule of thumb, if you > don't have to mess with symbolic links in your code, don't.  (I know > gEDA/gaf violates this rule on a few instances, but this is because I > wasn't sure if I'd break anything by changing them, and shouldn't be > used as an example for new code.) > The reason for using realpath was to disambiguate a case where recursion was occurring via symbolic links, i.e. to remove symbolic links from the equation. Because the subsheet mechanism uses filenames which come from the filesystem symbolic links are something which must be taken into account.  Also see below > 2) Using the filename for comparison isn't necessary anyway since the > composite_sources field already contains Schematic objects; you can > simply compare these objects. > > 3) The current comparison is done at the top of s_traverse_sheet1, but > I think it makes more sense to do the comparison in the loop which > invokes s_traverse_sheet1.  This means top-level schematics aren't > checked, but these can't be recursive anyway. > My first reaction was: Yes they can. Any sheet can recurse to itself. I checked. Upon thinking about it further I understand what you meant. This way still seems cleaner to me. > 4) Passing a common `stack' variable as an argument to > s_traverse_sheet1 is kind of mixing paradigms: > >   a) Either `stack' is conceptually shared between s_traverse_sheet1 > invocations.  In this case, there's no need to pass it as an argument; > s_traverse_sheet1 can just use the `sheet_stack' variable.  (I'd > prefer to place the definition before the function body to make things > clearer, but this isn't strictly necessary.) > >   b) Or, s_traverse_sheet1 passes the current invocation stack to the > next level of hierarchy.  In this case, it's conceptually cleaner to > pass a copy of the variable which is discarded afterwards.  Since you > don't need to do stack.pop() in this case, the `set' type seems more > appropriate: > >     s_traverse_sheet1(sheet, set()) >     ... >     s_traverse_sheet1(subsheet, set.union(stack, {sheet_path})) > re: 2-4 Schematic objects can't be compared as if the same schematic is referenced by different paths, e.g. a symbolic link, then there are multiple objects, one for each path. The reason for using a list of filenames was so that a useful diagnostic message could be provided so the user can find the source of the problem in their file tree. Such a message requires the order in which the sheets were referenced, meaning a set cannot be used. The only alternative is to recreate the list by traversing back up the sheet chain collecting ids as you go. Since you have the filenames as you traverse downward it is cleaner to collect them as you go. Full pathnames were used as the sheet structure may not be contained in a single directory, nor even in subdirectories of the working directory. I referred to it as a stack because that is essentially how it operates. Subsheet pathnames are pushed on as they are encountered and removed as processing is completed, just like a call stack. The only deviation from a strict stack paradigm is when recursion is detected the data is traversed like a list in order to provide the diagnostic information, just like if you are outputting a stack trace. Imagine trying to find the recursion point in a complex sheet structure if all you have to go on is that recursion exists somewhere. In order to capture the top level sheet which invokes the chain which contains the recursion the 'stack' needs to be accessible at both the top level and in s_traverse_sheet1. re: 4a It was a judgement call as to whether to define it prior to s_traverse_sheet1 and use it in both there and the top level loop or define it as I did and and pass it in as a parameter. I chose the course I did because: 1) I am still getting used to the scoping rules in python and wasn't certain that a variable from outside the body of s_traverse_sheet1, but inside the class constructor (which we are in this case) would be accessible as it isn't really a 'global' (PS testing says yes), 2) I find it cleaner to define it closest to the point where the data originates, i.e. the top level loop, and then pass it along. But in thanking about I can see your approach as well. You can't do the check at the top level loop because that wouldn't check anything but the top level sheets and so recursion below the top level wouldn't be caught. The only place to catch them all is in s_traverse_sheet1. If you check inside the loop in s_traverse_sheet1 as you point out the top level sheets aren't checked and you have also done a bunch of unneeded processing. > 5) A change like this should have a regression test, i.e., a minimal > test in xorn/tests/netlist/regression/ which fails before applying the > change and passes afterwards.  This way, if someone inadvertently > re-introduces the bug, the test will fail and the error be noticed. > Agreed. I tried. I actually had a very nice test all set up and commited (a commit which I had to discard) before I discovered that the way the test framework is set up it wouldn't work. 1. Recursion sets the error flag which causes xorn to exit with a non-zero status. 2. The test framework bails as soon as it sees a non-zero status from xorn which means the output is not checked. 3. If the recursion is allowed to go uncaught it crashes with a python runtime error which also issues a non-zero status (99 instead of 1). 4. Because the error information is all on stderror the only way to really distinguish whether the error was handled correctly is to compare that output. 5. Because the diagnostic message contains path names 'run-test' couldn't just be modified to check output after a non-zero status as the output on someone elses machine would be different than on my machine. I gave up when I hit #5 If you have a solution I am open to suggestions. > > 6) Keep Python coding style in mind: > >   a) operators are preceded and followed by a space, and commas are > followed by a space I thought I did > >   b) `if' is followed by a space and doesn't have parentheses around > its condition Again I thought I did, but admittedly old C/C++ habits die hard. (PS Okay I see it) > >   c) iterating over a sequence is preferentially done using the `for > member in sequence' syntax, on in this case, `for member in > reversed(sequence)' Yep, that was my original approach, but that doesn't give you positional information, AFAIK, and parts of the diagnostic message depend on where you are in the sequence, which was the reason for the numeric 'for'. It is cleaner to compare i>0 than i >   d) an `if' statement which handles an error condition and ends with > a `return' statement or similar should usually have its condition be > true on error and not have an `else' clause Don't know about 'usually' but I can see the argument you are making, If it really bugs you I will change it. > > 7) Netlist errors should be concise and not contain line breaks.  In > this case, the important information is the name of the instantiating > schematic (which is implicitly reported by Sheet.error) and the name > of the instantiated schematic; only the latter should be included in > the error message. Generally speaking I would agree, but I think it is useful in tracking down an error like this to have the sequence. Imagine reading a stack trace where all you have are the endpoints and nothing about how you got from one to the other. If working on a large project you may be using a sheet chain that someone else produced, possibly in collaboration with some third parties, in which case you can't just go changing schematics, and it is possible that the recursion is caused by something more subtle like a mis-configured rc file. It is also possible that the actual problem is at some midpoint, e.g. an incorrect symbolic link, not the endpoints. I am loathe to assume I can foresee all the ways in which this error might occur. > > >>> commit 23b15ba475ec26fb55727af2903f17fae371c4e5 >>> Author: Glenn Pavlovic >>> Date:   Sat Feb 6 21:55:22 2021 -0800 >>> >>>     Fixed a rare, freaky bug > > Do I understand correctly that this bug can only be triggered by a > recursing subsheet, which is caught by the previous commit? > > In this case, I'd prefer to not handle this case specially.  An > invalid internal state can lead to errors, but it's the invalid > internal state which should be fixed, not the assumption that the > state is sane. Frankly I don't know. The specific case I found that causes it uses a specific combination which involves recursion, and is invalid for other reasons as well. When I remove any one of the three pieces the bug goes away. Prior to the previous commit that set of conditions would have resulted in a crash from run away recursion and so never would have reached the point where this bug would have shown up. The reason I didn't track it further back was that the code is trying to remove something from a list which isn't in the list in this situation. So the end result appears to me to be the same. Since the input which causes the issue is both already invalid, and unlikely to occur it may be safe to simply discard this commit, but then the bug would still exist. I felt it was better to provide at least something as opposed to dropping a "by the way there is a bug in gaf.netlist.netlist which happens only if you do this weird thing" on your plate. > > >>> commit 1f2de17cce2e4e97b8be14dda145155d5d8bd1ae (HEAD -> stable-1.10) >>> Author: Glenn Pavlovic >>> Date:   Sat Feb 6 22:05:27 2021 -0800 >>> >>>     Remove a cruft comment > > This comment was a result of me erring very hard on the side of not > removing comments.  I believe it is safe to remove it. Victory! :) Glenn