Mail Archives: cygwin/2021/04/14/13:14:31

To: "'Ken Brown'" <kbrown AT cornell DOT edu>, <cygwin AT cygwin DOT com>
Subject: RE: AF_UNIX/SOCK_DGRAM is dropping messages
Date: Wed, 14 Apr 2021 19:14:20 +0200
From: Kristian Ivarsson via Cygwin <cygwin AT cygwin DOT com>
Reply-To: sten DOT kristian DOT ivarsson AT gmail DOT com

> >> Hi Ken
> >>
> >>>>>>>>>>> Using AF_UNIX/SOCK_DGRAM with current version (3.2.0) seems to
> >>>>>>>>>>> drop messages or at least they are not received in the same
> >>>>>>>>>>> order they are sent
> >>>>>>>
> >>>>>>> [snip]
> >>>>>>>
> >>>>>>>> Thanks for the test case.  I can confirm the problem.  I'm not
> >>>>>>>> familiar enough with the current AF_UNIX implementation to
> >>>>>>>> debug this easily.  I'd rather spend my time on the new
> >>>>>>>> implementation (on the topic/af_unix branch).  It turns out
> >>>>>>>> that your test case fails there too, but in a completely
> >>>>>>>> different way, due to a bug in sendto for datagrams.  I'll see
> >>>>>>>> if I can fix that bug and then try again.
> >>>>>>>>
> >>>>>>>> Ken
> >>>>>>>
> >>>>>>> Ok, too bad it wasn't our own code base but good that the
> >>>>>>> "mystery" is verified
> >>>>>>>
> >>>>>>> I finally succeeded in building topic/af_unix (after finding out what
> >>>>>>> version of zlib was needed), but without adding -D__WITH_AF_UNIX to
> >>>>>>> CXXFLAGS, so I haven't tested it yet
> >>>>>>>
> >>>>>>> Is it sufficient to add the define to the "main" Makefile or do
> >>>>>>> you have to add it to all the Makefile:s ? I guess I can find
> >>>>>>> out though
> >>>>>>
> >>>>>> I do it on the configure line, like this:
> >>>>>>
> >>>>>>     ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...
> >>>>>>
> >>>>>>> Is topic/af_unix fairly up to date with master branch ?
> >>>>>>
> >>>>>> Yes, I periodically cherry-pick commits from master to topic/af_unix.
> >>>>>> I'll do that again right now.
> >>>>>>
> >>>>>>> Either way, I'll be glad to help out testing topic/af_unix
> >>>>>>
> >>>>>> Thanks!
> >>>>>
> >>>>> I've now pushed a fix for that sendto bug, and your test case runs
> >>>>> without error on the topic/af_unix branch.
> >>>>
> >>>> It seems like the test case does work now with topic/af_unix in
> >>>> blocking mode, but when using non-blocking (with MSG_DONTWAIT)
> >>>> there are some issues, I think
> >>>>
> >>>> 1. When the queue is empty with non-blocking recv(), errno is set
> >>>> to EPIPE but I think it should be EAGAIN (or maybe the pipe is
> >>>> getting broken for real for some reason ?)
> >>>>
> >>>> 2. When using non-blocking recv() and no message is written at all,
> >>>> it seems like recv() blocks forever
> >>>>
> >>>> 3. Using non-blocking recv() where the "client" sends fewer than
> >>>> "count" messages, sometimes recv() blocks forever (as well)
> >>>>
> >>>>
> >>>> My naïve analysis of this is that for the first issue (if any) the
> >>>> wrong errno is set and for the second issue it blocks if no
> >>>> sendto() is done after the first recv(), i.e. nothing kicks the
> >>>> "reader thread" in the butt to realise the queue is empty. It is not
> >>>> super clear though what POSIX says about creating blocking descriptors
> >>>> and then using non-blocking flags with recv(), but this works on Linux
> >>>> anyway
> >>>
> >>> The explanation is actually much simpler.  In the recv code where a
> >>> bound datagram socket waits for a remote socket to connect to the
> >>> pipe, I simply forgot to handle MSG_DONTWAIT.  I've pushed a fix.
> >>> Please retest.
> >>>
> >>> I should add that in all my work so far on the topic/af_unix branch,
> >>> I've thought mainly about stream sockets.  So there may still be
> >>> things remaining to be implemented for the datagram case.
> >>
> >> I finally got some time to test topic/af_unix in our "real"
> >> Cygwin application (casual) and unfortunately very few of our unit
> >> tests pass
> >>
> >> The symptoms are that there's unexpected eternal blocking, sometimes
> >> there's unexpected EADDRNOTAVAIL, and sometimes it looks like some
> >> memory corruption (and core dumps)
> >>
> >> Of course the memory corruption etc. could be our own fault and the
> >> core dumps might be because of uncaught exceptions
> >>
> >> Needless to say, all unit tests pass on Linux, but of course
> >> Cygwin topic/af_unix could be acting according to the POSIX standard and
> >> the behaviour could be due to our own misinterpretation of how POSIX works
> >
> > More likely it's due to bugs in the topic/af_unix branch.  This is
> > still very much a work in progress.
> >
> >> I will try to narrow down the quite complex logic and reproduce the
> >> problems
> >
> > That would be ideal.
> >
> >> If you for some reason want to try it with casual, I'd be glad to help
> >> you out (it should be easier now than last time, but there might be
> >> some documentation missing for Cygwin still)
> >>
> >> https://bitbucket.org/casualcore/
> >
> > I'm going on vacation in a few days, but I might do this when I get back.
> >
> > Thanks for your testing.
> 
> By the way, if your code is using datagram sockets, then there are very serious
> problems with our implementation (even aside from the performance issue
> that we've already discussed).  For example, I don't know of any reasonable
> way for select to test whether such a socket is ready for writing.  We'll need to
> solve that somehow.
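
For reference, this is roughly the behaviour our code relies on. It is a minimal sketch, assuming a POSIX system such as Linux; the helper names dgram_writable_now() and probe_fresh_dgram_pair() are made up for illustration and say nothing about how Cygwin implements or should implement select(). It polls a freshly created AF_UNIX/SOCK_DGRAM socketpair for writability with a zero timeout.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if select() reports fd as writable
   right now, 0 if it does not, and -1 if select() itself fails. */
int dgram_writable_now(int fd)
{
    fd_set wfds;
    struct timeval tv = {0, 0};   /* zero timeout: poll, never block */

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);

    int n = select(fd + 1, NULL, &wfds, NULL, &tv);
    if (n < 0)
        return -1;
    return FD_ISSET(fd, &wfds) ? 1 : 0;
}

/* Hypothetical probe: create a connected datagram pair and ask whether
   the sending end is writable before anything has been queued. */
int probe_fresh_dgram_pair(void)
{
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    int r = dgram_writable_now(sv[0]);
    close(sv[0]);
    close(sv[1]);
    return r;
}
```

On Linux I would expect probe_fresh_dgram_pair() to return 1, since nothing has been queued yet.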

If you mean whether we're using SOCK_DGRAM, the answer is yes

I tried SOCK_STREAM (and SOCK_SEQPACKET I think) with Cygwin 3.2.0 but that didn't work at all

As far as I understand, all of these socket types preserve message ordering on pretty much all implementations though
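
The ordering expectation can be checked with something like the following. This is only a sketch, not our actual test case: check_dgram_order() and the batch size of 8 are assumptions chosen for illustration, and the batching is just there so that a single thread never fills the socket buffer and blocks in send().

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical check: sends `count` numbered datagrams over an
   AF_UNIX/SOCK_DGRAM socketpair and verifies the arrival order.
   Returns 0 if all arrive in order, the 1-based position of the
   first out-of-order message, or -1 on a socket error. */
int check_dgram_order(uint32_t count)
{
    enum { BATCH = 8 };           /* small batches so send() never blocks */
    int sv[2];
    uint32_t sent = 0, received = 0;
    int result = 0;

    assert(count < 0x7fffffff);   /* positions must fit in the int result */

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    while (received < count && result == 0) {
        /* Queue up to BATCH datagrams, each carrying its sequence number. */
        for (int i = 0; i < BATCH && sent < count; ++i, ++sent) {
            if (send(sv[0], &sent, sizeof sent, 0) != (ssize_t)sizeof sent) {
                result = -1;
                break;
            }
        }
        /* Drain everything queued so far and compare with the expected order. */
        while (result == 0 && received < sent) {
            uint32_t seq;
            if (recv(sv[1], &seq, sizeof seq, 0) != (ssize_t)sizeof seq)
                result = -1;
            else if (seq != received)
                result = (int)received + 1;
            else
                ++received;
        }
    }

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Calling check_dgram_order(100) should return 0 wherever ordering is preserved; a positive return value gives the position of the first message that arrived out of order.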

I haven't tried SOCK_STREAM and/or SOCK_SEQPACKET with the topic/af_unix-branch. Is that worth a try ?
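
If it is, a small probe like this could be run for each socket type. It is a sketch under the assumption that socketpair() supports the type on the platform under test; empty_recv_errno() is a made-up helper name. It does one recv() with MSG_DONTWAIT on an empty queue and reports the errno, which is where I saw EPIPE instead of EAGAIN earlier in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical probe: returns the errno produced by a non-blocking
   recv() on an empty AF_UNIX socketpair of the given type, 0 if recv()
   unexpectedly succeeded, or -1 if the socketpair could not be created. */
int empty_recv_errno(int type)
{
    int sv[2];
    char buf[16];
    int err;

    assert(type == SOCK_STREAM || type == SOCK_DGRAM || type == SOCK_SEQPACKET);

    if (socketpair(AF_UNIX, type, 0, sv) < 0)
        return -1;

    /* Nothing has been sent, so this should fail immediately, not block. */
    ssize_t n = recv(sv[1], buf, sizeof buf, MSG_DONTWAIT);
    err = (n < 0) ? errno : 0;

    close(sv[0]);
    close(sv[1]);
    return err;
}
```

POSIX allows either EAGAIN or EWOULDBLOCK here, so a caller should accept both when comparing the result for SOCK_STREAM, SOCK_DGRAM and SOCK_SEQPACKET.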

Best regards,
Kristian

> Ken

