[V2,2/4] net/colo-compare.c: Compare the tcp packets that has the same sequence number

Message ID 1499925175-21218-3-git-send-email-zhangchen.fnst@cn.fujitsu.com (mailing list archive)
State New, archived

Commit Message

Zhang Chen July 13, 2017, 5:52 a.m. UTC
If the primary packet's sequence number is not the same as the secondary
packet's sequence number, there is no need to compare the packet's other
fields.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
---
 net/colo-compare.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Jason Wang July 14, 2017, 3:25 a.m. UTC | #1
On 2017-07-13 13:52, Zhang Chen wrote:
> If primary packet's sequence number not same with secondary packet's
> sequence number, no need to compare the packet other field.
>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> ---
>   net/colo-compare.c | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 0f8e198..2caeb80 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -222,6 +222,12 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
>       ptcp = (struct tcphdr *)ppkt->transport_header;
>       stcp = (struct tcphdr *)spkt->transport_header;
>   
> +    if ((ptcp->th_flags & TH_SYN) != TH_SYN &&
> +        ptcp->th_seq != stcp->th_seq) {
> +        trace_colo_compare_main("colo_packet_compare_tcp seq not same");
> +        return -1;
> +    }
> +
>       /*
>        * The 'identification' field in the IP header is *very* random
>        * it almost never matches.  Fudge this by ignoring differences in

Do we have any statistics numbers for this?

Thanks
Dr. David Alan Gilbert July 14, 2017, 12:24 p.m. UTC | #2
* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> If primary packet's sequence number not same with secondary packet's
> sequence number, no need to compare the packet other field.
> 
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> ---
>  net/colo-compare.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 0f8e198..2caeb80 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -222,6 +222,12 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
>      ptcp = (struct tcphdr *)ppkt->transport_header;
>      stcp = (struct tcphdr *)spkt->transport_header;
>  
> +    if ((ptcp->th_flags & TH_SYN) != TH_SYN &&
> +        ptcp->th_seq != stcp->th_seq) {
> +        trace_colo_compare_main("colo_packet_compare_tcp seq not same");
> +        return -1;
> +    }

Do you need to check that the stcp->th_flags is the same ?

Looking back at patches I had in this area; I was doing
  if (ptcp->th_flags == stcp->th_flags &&

see:
   https://github.com/orbitfp7/qemu/commit/848ca1113aec802dd032fd5b6d6b301931b3e1e0
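
One possible reading of the flags question above, as a minimal sketch and
not the code from the linked commit: take the early exit only when both
packets carry identical flags, the packet is not a SYN, and the sequence
numbers differ. The struct and every name prefixed with sketch_ are
hypothetical stand-ins for the struct tcphdr fields used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for TH_SYN: the SYN bit of the TCP flags byte. */
#define SKETCH_TH_SYN 0x02

/* Hypothetical stand-in for the two struct tcphdr fields the patch reads. */
struct sketch_tcphdr {
    uint32_t th_seq;    /* sequence number (byte order does not matter here) */
    uint8_t  th_flags;  /* TCP flags */
};

/*
 * Returns -1 when the packets can be declared different without looking
 * at anything else, 0 when the full comparison still has to run.
 * Assumption: packets whose flags differ are left to the full comparison.
 */
static int sketch_seq_precheck(const struct sketch_tcphdr *ptcp,
                               const struct sketch_tcphdr *stcp)
{
    assert(ptcp != NULL && stcp != NULL);

    if (ptcp->th_flags == stcp->th_flags &&
        (ptcp->th_flags & SKETCH_TH_SYN) != SKETCH_TH_SYN &&
        ptcp->th_seq != stcp->th_seq) {
        return -1;
    }
    return 0;
}
```

With this shape, a SYN pair or a pair with mismatched flags never takes
the shortcut, so the existing field-by-field comparison still decides.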

Dave

>      /*
>       * The 'identification' field in the IP header is *very* random
>       * it almost never matches.  Fudge this by ignoring differences in
> -- 
> 2.7.4
> 
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Zhang Chen July 17, 2017, 7:39 a.m. UTC | #3
On 07/14/2017 11:25 AM, Jason Wang wrote:
>
>
> On 2017-07-13 13:52, Zhang Chen wrote:
>> If primary packet's sequence number not same with secondary packet's
>> sequence number, no need to compare the packet other field.
>>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   net/colo-compare.c | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index 0f8e198..2caeb80 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -222,6 +222,12 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
>>       ptcp = (struct tcphdr *)ppkt->transport_header;
>>       stcp = (struct tcphdr *)spkt->transport_header;
>>
>> +    if ((ptcp->th_flags & TH_SYN) != TH_SYN &&
>> +        ptcp->th_seq != stcp->th_seq) {
>> +        trace_colo_compare_main("colo_packet_compare_tcp seq not same");
>> +        return -1;
>> +    }
>> +
>>       /*
>>        * The 'identification' field in the IP header is *very* random
>>        * it almost never matches.  Fudge this by ignoring differences in
>
> Do we have any statistics numbers for this?

After rethinking this patch, I will drop it from the next version and
send an independent patch in the future.
In the FTP get test, the primary guest sends many packets that differ
from the secondary guest's: the individual packet payloads are not the
same, but the total payload is the same.
I think I will have to buffer some packets' payloads, keyed by sequence
number, for the comparison.
Any ideas about this?

Thanks
Zhang Chen

>
> Thanks
>
>
>
Dr. David Alan Gilbert July 17, 2017, 8:55 a.m. UTC | #4
* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> 
> 
> On 07/14/2017 11:25 AM, Jason Wang wrote:
> > 
> > 
> > On 2017-07-13 13:52, Zhang Chen wrote:
> > > If primary packet's sequence number not same with secondary packet's
> > > sequence number, no need to compare the packet other field.
> > > 
> > > Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> > > ---
> > >   net/colo-compare.c | 6 ++++++
> > >   1 file changed, 6 insertions(+)
> > > 
> > > diff --git a/net/colo-compare.c b/net/colo-compare.c
> > > index 0f8e198..2caeb80 100644
> > > --- a/net/colo-compare.c
> > > +++ b/net/colo-compare.c
> > > @@ -222,6 +222,12 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
> > >       ptcp = (struct tcphdr *)ppkt->transport_header;
> > >       stcp = (struct tcphdr *)spkt->transport_header;
> > >
> > > +    if ((ptcp->th_flags & TH_SYN) != TH_SYN &&
> > > +        ptcp->th_seq != stcp->th_seq) {
> > > +        trace_colo_compare_main("colo_packet_compare_tcp seq not same");
> > > +        return -1;
> > > +    }
> > > +
> > >       /*
> > >        * The 'identification' field in the IP header is *very* random
> > >        * it almost never matches.  Fudge this by ignoring differences in
> > 
> > Do we have any statistics numbers for this?
> 
> Rethink about this patch, I will remove it in next version and send a
> independent
> patch in the future.
> Because in FTP get test, primary guest send lots of packet differ to
> secondary guest's,
> the packet payload are not same, but the total payload are same.

Do you mean that the TCP stream is the same but the packet sizes are
different due to different fragmentation?

> I think I have to buffer some packet's payload depend on sequence number for
> comparison?
> Any idea about this?

The original COLO discussions ~2-3 years ago talked about performing TCP
reassembly and comparing the TCP stream; not a simple task.

But the version I worked with also had the rewrite of the sequence
numbers on the secondary to cause them to match even with the same
fragmentation - but that doesn't seem to be upstream yet.

Dave

> 
> Thanks
> Zhang Chen
> 
> > 
> > Thanks
> > 
> > 
> > 
> 
> -- 
> Thanks
> Zhang Chen
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Zhang Chen July 17, 2017, 9:23 a.m. UTC | #5
On 07/17/2017 04:55 PM, Dr. David Alan Gilbert wrote:
> * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
>>
>> On 07/14/2017 11:25 AM, Jason Wang wrote:
>>>
>>> On 2017-07-13 13:52, Zhang Chen wrote:
>>>> If primary packet's sequence number not same with secondary packet's
>>>> sequence number, no need to compare the packet other field.
>>>>
>>>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>>>> ---
>>>>    net/colo-compare.c | 6 ++++++
>>>>    1 file changed, 6 insertions(+)
>>>>
>>>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>>>> index 0f8e198..2caeb80 100644
>>>> --- a/net/colo-compare.c
>>>> +++ b/net/colo-compare.c
>>>> @@ -222,6 +222,12 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
>>>>        ptcp = (struct tcphdr *)ppkt->transport_header;
>>>>        stcp = (struct tcphdr *)spkt->transport_header;
>>>>
>>>> +    if ((ptcp->th_flags & TH_SYN) != TH_SYN &&
>>>> +        ptcp->th_seq != stcp->th_seq) {
>>>> +        trace_colo_compare_main("colo_packet_compare_tcp seq not same");
>>>> +        return -1;
>>>> +    }
>>>> +
>>>>        /*
>>>>         * The 'identification' field in the IP header is *very* random
>>>>         * it almost never matches.  Fudge this by ignoring differences in
>>> Do we have any statistics numbers for this?
>> Rethink about this patch, I will remove it in next version and send a
>> independent
>> patch in the future.
>> Because in FTP get test, primary guest send lots of packet differ to
>> secondary guest's,
>> the packet payload are not same, but the total payload are same.
> Do you mean that the TCP stream is the same but the packet sizes are
> different due to different fragmentation?

Yes, like this:
We send this payload: "1234567890".

primary:
pkt1 payload:"123"
pkt2 payload:"4567890"

secondary:
pkt1 payload:"1234567890"
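
The example above can be sketched in C: a packet-by-packet comparison
fails because the boundaries differ, while a comparison that walks both
sides as byte streams succeeds. This is only an illustration of the
idea; struct sketch_payload and sketch_stream_equal() are hypothetical
names, and the packets are assumed to be already in order, with no
retransmissions or overlaps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical payload-only view of a packet. */
struct sketch_payload {
    const char *data;
    size_t len;
};

/* Compare two packet lists as byte streams, ignoring packet boundaries. */
static bool sketch_stream_equal(const struct sketch_payload *a, size_t na,
                                const struct sketch_payload *b, size_t nb)
{
    size_t ia = 0, ib = 0;   /* current packet on each side */
    size_t oa = 0, ob = 0;   /* offset inside the current packet */

    assert((a != NULL || na == 0) && (b != NULL || nb == 0));

    for (;;) {
        /* Skip packets that are exhausted (or empty). */
        while (ia < na && oa == a[ia].len) { ia++; oa = 0; }
        while (ib < nb && ob == b[ib].len) { ib++; ob = 0; }

        if (ia == na || ib == nb) {
            /* Equal only if both sides ran out together. */
            return ia == na && ib == nb;
        }

        /* Compare the largest chunk available on both sides. */
        size_t ra = a[ia].len - oa;
        size_t rb = b[ib].len - ob;
        size_t n = ra < rb ? ra : rb;

        if (memcmp(a[ia].data + oa, b[ib].data + ob, n) != 0) {
            return false;
        }
        oa += n;
        ob += n;
    }
}
```

Called with the primary list {"123", "4567890"} and the secondary list
{"1234567890"}, the function reports the two sides as equal even though
no single primary packet matches the secondary packet.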


>
>> I think I have to buffer some packet's payload depend on sequence number for
>> comparison?
>> Any idea about this?
> The original COLO discussions ~2-3 years ago talked about performing TCP
> reassembly and comparing the TCP stream; not a simple task.
>
> But the version I worked with also had the rewrite of the sequence
> numbers on the secondary to cause them to match even with the same
> fragmentation - but that doesn't seem to be upstream yet.

In current QEMU upstream we use filter-rewriter to rewrite the sequence
numbers on the secondary, but we cannot avoid different fragmentation
on the two sides.
Any comments on how to guarantee that the primary side and the secondary
side have the same fragmentation?

Thanks
Zhang Chen

>
> Dave
>
>> Thanks
>> Zhang Chen
>>
>>> Thanks
>>>
>>>
>>>
>> -- 
>> Thanks
>> Zhang Chen
>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> .
>
Dr. David Alan Gilbert July 17, 2017, 10:02 a.m. UTC | #6
* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> 
> 
> On 07/17/2017 04:55 PM, Dr. David Alan Gilbert wrote:
> > * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> > > 
> > > On 07/14/2017 11:25 AM, Jason Wang wrote:
> > > > 
> > > > On 2017-07-13 13:52, Zhang Chen wrote:
> > > > > If primary packet's sequence number not same with secondary packet's
> > > > > sequence number, no need to compare the packet other field.
> > > > > 
> > > > > Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> > > > > ---
> > > > >    net/colo-compare.c | 6 ++++++
> > > > >    1 file changed, 6 insertions(+)
> > > > > 
> > > > > diff --git a/net/colo-compare.c b/net/colo-compare.c
> > > > > index 0f8e198..2caeb80 100644
> > > > > --- a/net/colo-compare.c
> > > > > +++ b/net/colo-compare.c
> > > > > @@ -222,6 +222,12 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
> > > > >        ptcp = (struct tcphdr *)ppkt->transport_header;
> > > > >        stcp = (struct tcphdr *)spkt->transport_header;
> > > > >
> > > > > +    if ((ptcp->th_flags & TH_SYN) != TH_SYN &&
> > > > > +        ptcp->th_seq != stcp->th_seq) {
> > > > > +        trace_colo_compare_main("colo_packet_compare_tcp seq not same");
> > > > > +        return -1;
> > > > > +    }
> > > > > +
> > > > >        /*
> > > > >         * The 'identification' field in the IP header is *very* random
> > > > >         * it almost never matches.  Fudge this by ignoring differences in
> > > > Do we have any statistics numbers for this?
> > > Rethink about this patch, I will remove it in next version and send a
> > > independent
> > > patch in the future.
> > > Because in FTP get test, primary guest send lots of packet differ to
> > > secondary guest's,
> > > the packet payload are not same, but the total payload are same.
> > Do you mean that the TCP stream is the same but the packet sizes are
> > different due to different fragmentation?
> 
> Yes, like that:
> We send this payload: "1234567890".
> 
> primary:
> pkt1 payload:"123"
> pkt2 payload:"4567890"
> 
> secondary:
> pkt1 payload:"1234567890"

Yes; I think it comes down to very fine-grained timing and its interaction
with Nagle's algorithm; if the guest is that bit slower in generating the
output, the network code will decide to send it.

> > 
> > > I think I have to buffer some packet's payload depend on sequence number for
> > > comparison?
> > > Any idea about this?
> > The original COLO discussions ~2-3 years ago talked about performing TCP
> > reassembly and comparing the TCP stream; not a simple task.
> > 
> > But the version I worked with also had the rewrite of the sequence
> > numbers on the secondary to cause them to match even with the same
> > fragmentation - but that doesn't seem to be upstream yet.
> 
> In current qemu upstream we use filter-rewriter to rewrite the sequence
> numbers on the secondary, but we can not avoid different fragmentation in
> two side.
> Any comments about guarantee the primary side and the secondary side have
> the same fragmentation?

I don't think you can; the only choice is to perform the comparison
after de-fragmentation - or to do the same thing by building your own
reassembly.
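
A rough sketch of what "building your own reassembly" could look like:
each side copies segment payloads into a small window indexed by
(seq - base_seq), and only the gap-free prefix present on both sides is
compared. Everything here is hypothetical (the sketch_ names, the tiny
fixed window, the lack of window sliding), and it assumes the sequence
numbers on both sides have already been aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CAP 64   /* hypothetical window size, tiny on purpose */

/* Hypothetical per-side reassembly window. */
struct sketch_reasm {
    uint32_t base_seq;          /* sequence number of buf[0] */
    uint8_t  buf[SKETCH_CAP];   /* reassembled bytes */
    uint8_t  have[SKETCH_CAP];  /* 1 where a byte has arrived */
};

static void sketch_reasm_init(struct sketch_reasm *r, uint32_t base_seq)
{
    memset(r, 0, sizeof(*r));
    r->base_seq = base_seq;
}

/* Store one segment; returns false if it does not fit in the window. */
static bool sketch_reasm_add(struct sketch_reasm *r, uint32_t seq,
                             const void *data, size_t len)
{
    assert(r != NULL && (data != NULL || len == 0));

    uint32_t off = seq - r->base_seq;   /* unsigned math handles wraparound */

    if (off > SKETCH_CAP || len > SKETCH_CAP - off) {
        return false;
    }
    if (len > 0) {
        memcpy(r->buf + off, data, len);
        memset(r->have + off, 1, len);
    }
    return true;
}

/* Length of the gap-free prefix starting at base_seq. */
static size_t sketch_reasm_ready(const struct sketch_reasm *r)
{
    size_t n = 0;

    while (n < SKETCH_CAP && r->have[n]) {
        n++;
    }
    return n;
}

/*
 * Compare the bytes both sides have reassembled without gaps so far.
 * *compared receives how many bytes were checked; the result is false
 * if any of them differ.
 */
static bool sketch_reasm_compare(const struct sketch_reasm *p,
                                 const struct sketch_reasm *s,
                                 size_t *compared)
{
    size_t np = sketch_reasm_ready(p);
    size_t ns = sketch_reasm_ready(s);
    size_t n = np < ns ? np : ns;

    *compared = n;
    return memcmp(p->buf, s->buf, n) == 0;
}
```

Feeding the primary "123" at base_seq and "4567890" at base_seq + 3, and
the secondary "1234567890" at base_seq, gives ten comparable bytes that
match, regardless of the order in which the primary segments arrive.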

Dave

> Thanks
> Zhang Chen
> 
> > 
> > Dave
> > 
> > > Thanks
> > > Zhang Chen
> > > 
> > > > Thanks
> > > > 
> > > > 
> > > > 
> > > -- 
> > > Thanks
> > > Zhang Chen
> > > 
> > > 
> > > 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > 
> > 
> > .
> > 
> 
> -- 
> Thanks
> Zhang Chen
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Patch

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 0f8e198..2caeb80 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -222,6 +222,12 @@  static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
     ptcp = (struct tcphdr *)ppkt->transport_header;
     stcp = (struct tcphdr *)spkt->transport_header;
 
+    if ((ptcp->th_flags & TH_SYN) != TH_SYN &&
+        ptcp->th_seq != stcp->th_seq) {
+        trace_colo_compare_main("colo_packet_compare_tcp seq not same");
+        return -1;
+    }
+
     /*
      * The 'identification' field in the IP header is *very* random
      * it almost never matches.  Fudge this by ignoring differences in