<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html><head> <title>libcurl - programming tutorial</title>
<meta content="text/html; charset=utf8" http-equiv="Content-Type">
<!--
<link rel="STYLESHEET" type="text/css" href="libcurl-tutorial_files/curl.css">
<link rel="shortcut icon" href="http://curl.haxx.se/favicon.ico">
<link rel="STYLESHEET" type="text/css" href="libcurl-tutorial_files/manpage.css">
-->
<style>
P.level0 {
padding-left: 2em;
}
P.level1 {
padding-left: 4em;
}
P.level2 {
padding-left: 6em;
}
span.emphasis {
font-style: italic;
}
span.bold {
font-weight: bold;
}
span.manpage {
font-weight: bold;
}
h2.nroffsh {
background-color: #e0e0e0;
}
span.nroffip {
font-weight: bold;
font-size: 120%;
font-family: monospace;
}
p.roffit {
text-align: center;
font-size: 80%;
}
<!--
BODY {
background-color: white;
font-family: arial, helvetica, ariel, sans-serif;
font-size: 90%;
color: black;
}
/* the curl-style blue main page titles */
.pagetitle, h1.libtitle, .newstitle {
border-style: solid;
border-width: thin;
border-color: black;
background-color: #e0e0ff;
font-family: arial, helvetica, ariel, sans-serif;
color: #0000ff;
font-size: 150%;
font-weight: bold;
padding: 0px 4px 0px 4px;
}
.title {
color: #000000;
background-color: #f0f0ff;
font-family: arial, helvetica, ariel, sans-serif;
font-size: 120%;
}
h2 {
font-family: arial, helvetica, ariel, sans-serif;
background-color: #f0f0ff;
}
.subtitle {
font-weight: bold;
}
tr.tabletop {
font-size: 120%;
font-family: sans-serif;
color: #ffffff;
background-color: #0000ff;
}
.tabletop a {
color: #ffffff;
}
.buildfail {
color: #000000;
background-color: #ff8080;
}
.buildserverprob {
color: #000000;
background-color: #ffff00;
}
.buildfine {
color: #000000;
background-color: #00ff00;
}
.compile, .changetable {
border: outset 2px #ffffff;
}
table.news {
border: outset 2px #8080ff;
}
table.secbox {
border: outset 2px #8080ff;
background-color: #e0e0ff;
}
table.latestmail {
border: outset 2px #8080ff;
font-size: 100%;
}
/*
.newstitle {
font-size: 120%;
white-space: nowrap;
background-color: #0000ff;
color: #ffffff;
}
*/
td.newsdate {
text-align: right;
}
.mini {
font-size: 8pt;
font-family: monospace;
}
.warning {
font-size: 9pt;
color: #000000;
background-color: #ff8080;
}
/* like used for important news items */
.alert {
font-size: 120%;
color: #ffffff;
background-color: #f00000;
padding: 4px 4px 4px 4px;
}
/* the one in the download database */
td.desc {
color: #000000;
background: #e0e0ff;
}
/* used in the download table for a single OS */
.ostitleleft {
color: #000000;
background-color: #ffffff;
font-family: arial, helvetica, ariel, sans-serif;
font-size: 120%;
font-weight: bold;
padding: 4px 4px 4px 4px;
border-top: 1px solid black;
border-left: 1px solid black;
border-right: none;
}
.ostitleright {
color: #000000;
background-color: #ffffff;
font-family: arial, helvetica, ariel, sans-serif;
font-size: 120%;
font-weight: bold;
padding: 4px 4px 4px 4px;
border-top: 1px solid black;
border-right: 1px solid black;
border-left: none;
}
td.ostitle {
color: #000000;
background-color: #ffffff;
font-family: arial, helvetica, ariel, sans-serif;
font-size: 120%;
font-weight: bold;
padding: 4px 4px 4px 4px;
margin: 0px 0px 0px 0px;
border-top: 1px solid black;
border-left: 1px solid black;
border-right: 1px solid black;
}
/* download page */
.latest2 {
color: #000000;
background-color: #ffff44;
border-left: 1px solid black;
border-right: 1px solid black;
margin: 0px 0px 0px 0px;
padding: 1px 0px 1px 0px;
}
/* download page */
.older2 {
color: #000000;
background-color: #ffffff;
border-left: 1px solid black;
border-right: 1px solid black;
margin: 0px 0px 0px 0px;
padding: 1px 0px 1px 0px;
}
.col1 {
border-left: 1px solid black;
margin: 0px 0px 0px 0px;
border-bottom: none;
border-top: none;
}
.col7 {
margin: 0px 0px 0px 0px;
border-right: 1px solid black;
border-bottom: none;
border-top: none;
}
.col2, .col3, .col4, .col5, .col6 {
margin: 0px 0px 0px 0px;
padding: 0px 4px 0px 4px;
}
.osend {
border-top: 1px solid black;
margin: 0px 0px 10px 0px;
}
.download2 {
background-color: #e0e0e0;
padding: 10px 10px 10px 10px;
}
/* used on the version number lines in the verdiff.cgi script */
span.version {
font-weight: bold;
font-size: 120%;
border-style: solid;
border-width: thin;
border-color: black;
padding: 4px 4px 4px 4px;
}
/* used on the who/date lines in the verdiff.cgi script */
span.whodate {
font-weight: bold;
}
tr.odd {
color: #000000;
background-color: #e0e0e0;
}
/* used for Metalink download links */
span.metalink {
font-weight: bold;
font-family: helvetica narrow, arial narrow, sans-serif;
}
/* a non-selected item in the main menu */
.menuitem, .libitem {
text-decoration: none;
color: #ffffff;
background-color: #4040ff;
white-space: nowrap;
padding: 0px 0px 0px 0px;
font-family: sans-serif;
}
/* a non-selected item in the main menu, with the mouse hovering */
.menuitem:hover, .libitem:hover {
text-decoration: none;
color: #ffffff;
background-color: #202080;
white-space: nowrap;
padding: 0px 0px 0px 0px;
font-family: sans-serif;
}
/* a selected item in the main menu */
.itemselect, .libselect {
font-family: sans-serif;
text-decoration: none;
background-color: white;
font-weight: bold;
color: black;
white-space: nowrap;
padding: 0px 2px 0px 2px;
}
/* the main curl site left-side menu box */
.mainmenu, .libmenu {
color: #ffffff;
background-color: #4040ff;
padding: 8px 4px 8px 2px;
}
/* the about infobox in the right-bottom */
.aboutbox {
border-color: black;
border-width: 2px;
border-style: solid;
background-color: #e0e0ff;
color: #000000;
font-size: 80%;
float: right;
text-align: left;
width: 15em;
padding: 2px 2px 2px 2px;
}
/* the toplevel centered links */
.ad a {
border-color: #c0c0ff;
border-style: outset;
text-align: center;
text-decoration: none;
color: #ffffff;
background-color: #4040ff;
margin: 0px 8px 0px 8px;
padding: 0px 3px 0px 3px;
}
.relatedbox {
border-color: black;
border-width: 1px;
border-style: solid;
color: #000000;
float: right;
text-align: left;
padding: 2px 2px 2px 2px;
margin: 4px 4px 4px 4px;
background: white;
}
div.oslinks {
clear: right;
border-color: black;
border-width: 1px;
border-style: outset;
color: #000000;
float: right;
text-align: right;
padding: 2px 2px 2px 2px;
background: white;
font-size: 80%;
}
div.pollbox {
border-color: black;
border-width: 2px;
border-style: outset;
font-size: 80%;
color: #000000;
float: right;
text-align: left;
padding: 0px 2px 0px 2px;
background: #ffffe0;
}
p.ingres {
width: 80%;
font-family: arial, helvetica, ariel, sans-serif;
font-style: italic;
margin-left: auto;
margin-right: auto;
}
div.quote {
border-style: solid;
border-width: thin;
border-color: black;
padding: 2px 2px 2px 2px;
background-color: #f0f0ff;
width: 90%;
margin-left: 5%;
margin-right: 5%;
}
/* yellow box */
div.yellowbox {
border-style: outset;
border-width: 3px;
border-color: black;
padding: 2px 2px 2px 2px;
background-color: #fffff0;
/*
width: 90%;
margin-left: 5%; */
}
.mirrorlinks {
font-size: 8pt;
}
.phpbox {
background-color: #ffe0e0;
color: #0000ff;
border: solid 1px #000000;
padding: 4px 4px 4px 4px;
font-size: 120%;
}
.bindingbox {
float: right;
border: solid 1px #000000;
background-color: #ffffff;
padding: 2px 2px 2px 2px;
}
.bottomad {
border: solid 1px #000000;
background-color: #e0e0e0;
padding: 5px 5px 5px 5px;
margin-left: auto;
margin-right: auto;
}
-->
</style>
</head>
<body alink="red" bgcolor="#ffffff" link="#0000ff" text="#000000" vlink="#808080">
<!-- first-line-in-body -->
<table cellpadding="5" cellspacing="0"><tbody><tr>
<td align="left" valign="top" width="90%">
<h1 class="pagetitle"> libcurl-tutorial.3 -- man page </h1>
<div class="relatedbox">
<b>Related:</b>
<br><a href="http://curl.haxx.se/libcurl/c/example.html">Examples</a>
<br><a href="http://curl.haxx.se/libcurl/c/">API</a>
</div>
<!-- generated with roffit 0.7 -->
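<p class="level0"><a name="NAME"></a></p><h2 class="nroffsh">NAME</h2>
<p class="level0">libcurl-tutorial - libcurl programming tutorial <span class="emphasis">Original: <a href="http://curl.haxx.se/libcurl/c/libcurl-tutorial.html">http://curl.haxx.se/libcurl/c/libcurl-tutorial.html</a> </span> <a name="Objective"></a></p><h2 class="nroffsh">Objective</h2>
<p class="level0">This document describes the general principles and some basic approaches to consider when programming with libcurl. It focuses on the C interface, but much of it also applies to interfaces for other languages that follow the C one closely.
</p><p class="level0">This document uses "the user" to mean the developer who writes code that uses libcurl, and "the program" to mean the code written with libcurl.
</p><p class="level0">For more details on libcurl options and functions, refer to the official reference documentation.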
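</p><p class="level0"><a name="Building"></a></p><h2 class="nroffsh">Building</h2>
<p class="level0">There are many ways to build C programs. This chapter assumes a Unix-style build environment. If you use a different build system, you can still read it for general information that may apply to your environment as well.
</p><p class="level0"><a name="Compiling"></a><span class="nroffip">Compiling the Program</span>
</p><p class="level1">Your compiler needs to know where the libcurl headers are located, so you must add the directory where they are installed to the compiler's include path. The curl-config tool reports this information:
</p><p class="level1">$ curl-config --cflags
</p><p class="level1">
</p><p class="level0"><a name="Linking"></a><span class="nroffip">Linking the Program with libcurl</span>
</p><p class="level1">Once the program is compiled, you need to link your object files into a single executable. You need to link with libcurl, possibly with other libraries that libcurl itself depends on (such as the OpenSSL libraries), and perhaps with some standard operating system libraries as well. To figure out which flags to use, the curl-config tool comes to the rescue once again:
</p><p class="level1">$ curl-config --libs
</p><p class="level1">
</p><p class="level0"><a name="SSL"></a><span class="nroffip">SSL or Not</span>
</p><p class="level1">libcurl can be built and customized in many ways. One thing that varies between builds is support for SSL-based transfers such as HTTPS and FTPS. If a supported SSL library is detected at build time, libcurl is built with SSL support. To find out whether an installed libcurl was built with SSL support, use:
</p><p class="level1">$ curl-config --feature
</p><p class="level1">If SSL is supported, the keyword "SSL" is printed, possibly together with other optional features.
</p><p class="level1">See also "Features libcurl Provides" further down.
</p><p class="level0"><a name="autoconf"></a><span class="nroffip">autoconf macro</span>
</p><p class="level1">If you write your own configure script to detect libcurl and set variables accordingly, we offer a prewritten macro that probably does everything you need. See the docs/libcurl/libcurl.m4 file, which documents how to use it.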
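</p><p class="level1"><a name="Portable"></a></p><h2 class="nroffsh">Portable Code in a Portable World</h2>
<p class="level0">The people behind libcurl have put considerable effort into making libcurl work on a large number of operating systems and environments.
</p><p class="level0">You program libcurl the same way on every platform it runs on, with only a few minor differences to consider. If you make sure your own code is portable, you may well end up with a very portable program. libcurl should not stop you from that.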
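</p><p class="level0"><a name="Global"></a></p><h2 class="nroffsh">Global Preparation</h2>
<p class="level0">The program must initialize libcurl globally before using it. This is done exactly once, no matter how many times you intend to use the library. Use:
</p><p class="level0"> curl_global_init()
</p><p class="level0">curl_global_init() takes one parameter, a bit pattern that tells libcurl what to initialize.
<span class="emphasis">CURL_GLOBAL_ALL</span>
initializes all known internal sub-modules and is a good default. The two bits currently specified are:
</p><p class="level1">
</p><p class="level0"><a name="CURLGLOBALWIN32"></a><span class="nroffip">CURL_GLOBAL_WIN32</span>
</p><p class="level1">This bit only has an effect on Windows, where it makes libcurl initialize the win32 socket library. Without that initialization, your program cannot use sockets properly. The win32 socket library should be initialized only once per application, so if your own code or another library already does this, do not set this bit when initializing libcurl.
</p><p class="level0"><a name="CURLGLOBALSSL"></a><span class="nroffip">CURL_GLOBAL_SSL</span>
</p><p class="level1">This bit only has an effect when libcurl was built with SSL support, in which case it initializes the SSL library. Likewise, the SSL library should be initialized only once per application, so if your own code or another library already does this, do not set this bit when initializing libcurl.
</p><p class="level0">
</p><p class="level0">libcurl has a default protection mechanism: if <a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_easy_perform.html">curl_easy_perform(3)</a> is called before <a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_global_init.html">curl_global_init(3)</a> has been called,
libcurl runs <a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_global_init.html">curl_global_init(3)</a> itself with default parameters. Note that relying on this is not good practice.
</p><p class="level0">When the program no longer needs libcurl, it should call <a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_global_cleanup.html">curl_global_cleanup(3)</a>. It is the opposite of the init call and releases the resources that the init call set up.
</p><p class="level0">Avoid repeated calls to <a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_global_init.html">curl_global_init(3)</a> and <a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_global_cleanup.html">curl_global_cleanup(3)</a>. Each should be called only once.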
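</p><p class="level0"><a name="Features"></a></p><h2 class="nroffsh">Features libcurl Provides</h2>
<p class="level0">It is best to determine which features libcurl supports at run time rather than at build time. By calling <a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_version_info.html">curl_version_info(3)</a> and inspecting the struct it returns, your program can find out exactly what the libcurl it is running with supports.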
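</p><p class="level0"><a name="Handle"></a></p><h2 class="nroffsh">Handle the Easy libcurl</h2>
<p class="level0">
libcurl first introduced the so-called "easy interface". All operations in the easy interface are prefixed with "curl_easy".
</p><p class="level0">Recent libcurl versions also offer the "multi interface". Its purpose and use are described in a separate chapter further down. You still need to understand the easy interface first.
</p><p class="level0">To use the easy interface, you must first create an easy handle. Each session you want to perform needs its own easy handle. As a rule, use one handle per thread that performs transfers, and never share the same handle between threads.
</p><p class="level0">Create a handle with:
</p><p class="level0"> easyhandle = curl_easy_init();
</p><p class="level0">It returns an easy handle. With it, you proceed to the next step: setting up the actions you want. A handle is just a logical entity for the upcoming transfer or series of transfers.
</p><p class="level0">You set properties and options for the handle using <a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_easy_setopt.html">curl_easy_setopt(3)</a>. They control how the subsequent transfer or transfers are made. Options remain set in the handle until you set them again to something different. Multiple requests that use the same handle therefore share the same options.
</p><p class="level0">Many of the options are strings terminated with a zero byte. When you set a string with <a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_easy_setopt.html">curl_easy_setopt(3)</a>, libcurl makes its own copy, so you do not need to keep the string around in your application afterwards.[4]
</p><p class="level0">The most common and most basic option to set on a handle is the URL. You set it like this:
</p><p class="level0"></p><pre><p class="level0"> curl_easy_setopt(easyhandle, CURLOPT_URL, "<a href="http://domain.com/">http://domain.com/</a>");
</p></pre>
<p class="level0">
</p><p class="level0">
Let's assume for a while that you want to receive the data of the remote resource that the URL identifies. Since you are writing an application that needs this transfer, you probably want the data passed to you directly instead of simply being sent to stdout. For that, you write your own callback function that matches this prototype:
</p><p class="level0"> size_t write_data(void *buffer, size_t size, size_t nmemb, void *userp);
</p><p class="level0">
You then tell libcurl to pass all received data to this function with:
</p><p class="level0"> curl_easy_setopt(easyhandle, CURLOPT_WRITEFUNCTION, write_data);
</p><p class="level0">You can control what your callback receives in its fourth argument by setting another option:
</p><p class="level0"> curl_easy_setopt(easyhandle, CURLOPT_WRITEDATA, &internal_struct);
</p><p class="level0">Using that option, you can easily pass local data between your application and the function that libcurl invokes. libcurl itself does not touch the data you pass with <span class="emphasis">CURLOPT_WRITEDATA</span>. It hands it to the callback unchanged.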
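</p><p class="level0">As an illustration, here is one possible write callback that matches the prototype above and collects everything it receives in a growing memory buffer. The struct membuf is a hypothetical application type, not part of libcurl; a pointer to it is what the application would pass with CURLOPT_WRITEDATA.
</p><pre>
```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical application struct that collects the received bytes. */
struct membuf {
  char *data;   /* heap buffer, zero terminated after every append */
  size_t len;   /* number of payload bytes stored so far */
};

/* Write callback: appends the delivered chunk to the struct that was
   passed in through CURLOPT_WRITEDATA. */
size_t write_data(void *buffer, size_t size, size_t nmemb, void *userp)
{
  size_t chunk = size * nmemb;
  struct membuf *mem = (struct membuf *)userp;
  char *grown = realloc(mem->data, mem->len + chunk + 1);
  if(!grown)
    return 0; /* reporting fewer bytes than received aborts the transfer */
  mem->data = grown;
  memcpy(mem->data + mem->len, buffer, chunk);
  mem->len += chunk;
  mem->data[mem->len] = '\0';
  return chunk;
}
```
</pre>
<p class="level0">An application using this sketch would start from a zeroed struct membuf, register it as the callback data, and free its data member once the transfer is done.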
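</p><p class="level0">If you do not set a callback with <span class="emphasis">CURLOPT_WRITEFUNCTION</span>, libcurl uses its own default callback, which simply writes the received data to stdout. You can make the default callback write to a different file by passing a "FILE *" to a file opened for writing with the <span class="emphasis">CURLOPT_WRITEDATA</span> option.
</p><p class="level0">Now we need to take a step back and a deep breath, because here is the first of the platform-dependent differences mentioned earlier.
On some platforms, libcurl cannot operate on files opened by the program. If you use the default callback and pass an open file pointer with <span class="emphasis">CURLOPT_WRITEDATA</span>, the program may crash. Avoid this usage to make your code work almost everywhere.
</p><p class="level0">(<span class="emphasis">CURLOPT_WRITEDATA</span> was formerly known as <span class="emphasis">CURLOPT_FILE</span>. Both names still work and do the same thing.)
</p><p class="level0">If you use libcurl as a win32 DLL, you must set <span class="emphasis">CURLOPT_WRITEFUNCTION</span> whenever you set <span class="emphasis">CURLOPT_WRITEDATA</span>, or you will experience crashes.
</p><p class="level0">There are of course many more options you can set, and we will get back to some of them later. Let's continue with the actual transfer:
</p><p class="level0"> success = curl_easy_perform(easyhandle);
</p><p class="level0"><a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_easy_perform.html">curl_easy_perform(3)</a>
connects to the remote site, issues the necessary commands and receives the transfer. Whenever it receives data, it calls the callback function we set earlier. The function may get one byte at a time or many kilobytes at once. libcurl delivers as much as possible, as often as possible. Your callback should return the number of bytes it took care of. If that differs from the number of bytes passed to it, libcurl aborts the operation and returns an error code.
</p><p class="level0">When the transfer is complete, the return value of <a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_easy_perform.html">curl_easy_perform(3)</a> tells you whether it succeeded. If the return code alone is not enough information, you can use CURLOPT_ERRORBUFFER to point libcurl to a buffer of yours in which it stores a human-readable error message.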
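</p><p class="level0">When a transfer has finished, the handle can be used again. Reusing an existing handle for the next transfer is recommended, because libcurl will then try to reuse the previous connection.
</p><p class="level0">For some protocols, downloading a file can involve a complicated sequence of logging in, setting the transfer mode, changing the current directory and finally transferring the data. libcurl takes care of all that for you. Give it the URL of a file and it handles everything needed to get that file from one machine to another.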
</p><p class="level0"><a name="Multi-threading"></a></p><h2 class="nroffsh">Multi-threading Issues</h2>
<p class="level0">The first basic rule is that you must <span class="bold">never</span> share a libcurl handle between threads, whether it is an easy handle or a multi handle. Only use a handle in one thread at a time.
</p><p class="level0">
libcurl is thread safe except for two issues: signals and SSL/TLS handlers. Signals are used to time out name resolves (DNS lookups) when libcurl is built without c-ares support and is not running on Windows.
</p><p class="level0">
If you access HTTPS or FTPS URLs from multiple threads, you are also using the underlying SSL library from multiple threads. Those libraries may have their own requirements for multi-threaded use. See:
</p><p class="level0">OpenSSL
</p><p class="level0"> <a href="http://www.openssl.org/docs/crypto/threads.html">http://www.openssl.org/docs/crypto/threads.html</a>#DESCRIPTION
</p><p class="level0">GnuTLS
</p><p class="level0"> <a href="http://www.gnu.org/software/gnutls/manual/html_node/">http://www.gnu.org/software/gnutls/manual/html_node/</a>Multi_002dthreaded-applications.html
</p><p class="level0">NSS
</p><p class="level0"> Claimed to be thread safe already, with nothing further required.
</p><p class="level0">PolarSSL
</p><p class="level0"> Required actions unknown.
</p><p class="level0">yassl
</p><p class="level0"> Required actions unknown.
</p><p class="level0">axTLS
</p><p class="level0"> Required actions unknown.
</p><p class="level0">When using multiple threads, you should set the CURLOPT_NOSIGNAL option to 1 for all handles. Everything will then work fine except that timeouts are not honored during DNS lookups. You can work around that by building libcurl with c-ares support. c-ares is a library that provides asynchronous name resolves. On some platforms, libcurl simply does not function properly multi-threaded unless this option is set.
</p><p class="level0">Also note that CURLOPT_DNS_USE_GLOBAL_CACHE is not thread safe.
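</p><p class="level0"><a name="When"></a></p><h2 class="nroffsh">When It Doesn't Work</h2>
<p class="level0">There will always be times when a transfer fails, for many possible reasons. You might have set the wrong libcurl option or misunderstood what an option actually does, or the remote server might return a non-standard reply that confuses the library, which in turn confuses your program.
</p><p class="level0">There is one golden rule when these things occur: set the CURLOPT_VERBOSE option to 1. It makes the library output all the protocol details it sends, some internal information, and some received protocol data as well (especially when using FTP).
When using HTTP, studying the received data this way is also a good way to understand how the server behaves.
</p><p class="level0">libcurl may of course have bugs. If you find one, please let us know so that we can fix it. When you report a suspected bug, include as many details as you can: the protocol dump that CURLOPT_VERBOSE produces, the library version, as much as possible of your code that uses libcurl, the operating system name and version, the compiler name and version, and so on.
</p><p class="level0">If CURLOPT_VERBOSE is not enough, you can increase the level of debug data your application receives by using CURLOPT_DEBUGFUNCTION.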
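</p><p class="level0">Getting in-depth knowledge of the protocols involved is never wrong. If you study the relevant RFC documents, you will understand better how libcurl works and how to use it well.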
</p><p class="level0"><a name="Upload"></a></p><h2 class="nroffsh">Upload Data to a Remote Site</h2>
<p class="level0">libcurl tries to keep a protocol-independent approach to most transfers, so uploading to a remote FTP site is very similar to uploading data to an HTTP server with a PUT request.
</p><p class="level0">Of course, you first either create an easy handle or reuse an existing one. Then you set the URL to operate on just as before. This is the remote URL that we will now upload to.
</p><p class="level0">Since we are writing an application, we most likely want libcurl to get the upload data by asking us for it. To make it do that, we set the read callback and the custom pointer that libcurl passes to it. The read callback should have a prototype similar to:
</p><p class="level0"> size_t function(char *bufptr, size_t size, size_t nitems, void *userp);
</p><p class="level0">Here bufptr points to a buffer that we fill with data to upload, and size*nitems is the size of that buffer and therefore also the maximum amount of data we can return to libcurl in this call. The userp pointer is the custom pointer we set to point to a struct of ours, used to pass private data between the application and the callback.
</p><p class="level0"> curl_easy_setopt(easyhandle, CURLOPT_READFUNCTION, read_function);
</p><p class="level0"> curl_easy_setopt(easyhandle, CURLOPT_READDATA, &filedata);
</p><p class="level0">Tell libcurl that we want to upload:
</p><p class="level0"> curl_easy_setopt(easyhandle, CURLOPT_UPLOAD, 1L);
</p><p class="level0">A few protocols do not behave properly when an upload is done without prior knowledge of the expected file size. So, whenever the size is known, set it with CURLOPT_INFILESIZE_LARGE like this[1]:
</p><p class="level0"></p><pre><p class="level0"> /* in this example, file_size must be a curl_off_t variable */
curl_easy_setopt(easyhandle, CURLOPT_INFILESIZE_LARGE, file_size);
</p></pre>
<p class="level0">
</p><p class="level0">When you call <a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_easy_perform.html">curl_easy_perform(3)</a>
this time, it performs all the necessary operations, and once the upload has started it calls your callback to get the data to upload. The callback should return as much data as possible on every invocation, as that is likely to make the upload as fast as possible. The callback should return the number of bytes it wrote to the buffer. Returning 0 signals the end of the upload.
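</p><p class="level0">As a sketch of such a read callback, the function below uploads from a block of memory. The struct upload_state is a hypothetical application type; a pointer to it is what the application would pass with CURLOPT_READDATA.
</p><pre>
```c
#include <string.h>

/* Hypothetical application struct describing the data still to upload. */
struct upload_state {
  const char *ptr;   /* next byte to hand to libcurl */
  size_t left;       /* number of bytes not yet handed over */
};

/* Read callback: copies as much as fits into libcurl's buffer and
   returns the number of bytes copied. */
size_t read_function(char *bufptr, size_t size, size_t nitems, void *userp)
{
  struct upload_state *up = (struct upload_state *)userp;
  size_t room = size * nitems;
  size_t n = up->left < room ? up->left : room;
  memcpy(bufptr, up->ptr, n);
  up->ptr += n;
  up->left -= n;
  return n; /* becomes 0 once everything has been handed over */
}
```
</pre>
<p class="level0">Because the struct records how much is left, the callback fills every buffer completely until the data runs out and then returns 0 by itself.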
</p><p class="level0"><a name="Passwords"></a></p><h2 class="nroffsh">Passwords</h2>
<p class="level0">
Many protocols use or even require a user name and password before you can download or upload the data of your choice. libcurl offers several ways to specify them.
</p><p class="level0">Most protocols let you specify the name and password in the URL itself. libcurl detects this and uses them accordingly. It is written like this:
</p><p class="level0"> protocol://user:password@example.com/path/
</p><p class="level0">If you need any unusual characters in your user name or password, you should enter them URL encoded, as %XX where XX is a two-digit hexadecimal number.
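</p><p class="level0">To make the %XX rule concrete, here is a small stand-alone sketch of such an encoder. The helper percent_encode is hypothetical and only illustrates the rule; libcurl itself offers curl_easy_escape(3) for this purpose. The sketch encodes every byte that is not an ASCII letter, digit, '-', '.', '_' or '~'.
</p><pre>
```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Characters that are copied through unchanged. */
static int is_unreserved(unsigned char c)
{
  return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
         (c >= '0' && c <= '9') ||
         c == '-' || c == '.' || c == '_' || c == '~';
}

/* Hypothetical helper: returns a malloc()ed copy of 'in' in which every
   other byte is written as %XX. The caller frees the result.
   Returns NULL if memory runs out. */
char *percent_encode(const char *in)
{
  size_t len = strlen(in);
  char *out = malloc(len * 3 + 1); /* worst case: every byte encoded */
  char *p = out;
  size_t i;
  if(!out)
    return NULL;
  for(i = 0; i < len; i++) {
    unsigned char c = (unsigned char)in[i];
    if(is_unreserved(c))
      *p++ = (char)c;
    else {
      sprintf(p, "%%%02X", (unsigned int)c);
      p += 3;
    }
  }
  *p = '\0';
  return out;
}
```
</pre>
<p class="level0">With this helper, the password "p@ss:w/rd" becomes "p%40ss%3Aw%2Frd", which can be placed in the URL without being mistaken for the separators between user, password, host and path.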
</p><p class="level0">libcurl also provides options to set various passwords. The user name and password shown embedded in the URL can instead be set with the CURLOPT_USERPWD option. The argument passed to libcurl should be a char * to a string in the format "user:password", like this:
</p><p class="level0"> curl_easy_setopt(easyhandle, CURLOPT_USERPWD, "myname:thesecret");
</p><p class="level0">A name and password may also be needed by users who have to authenticate themselves to a proxy. libcurl offers another option for this, CURLOPT_PROXYUSERPWD. It is used much like the CURLOPT_USERPWD option:
</p><p class="level0"> curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD, "myname:thesecret");
</p><p class="level0">
There is a long-standing Unix "standard" way of storing FTP user names and passwords, namely in the $HOME/.netrc file. The file should be made private so that only the user may read it (see also the "Security Considerations" chapter), as it might contain the password in plain text. libcurl can use this file to figure out which user name and password to use for a particular host. As an extension to the normal functionality, libcurl also supports this file for non-FTP protocols such as HTTP. To make libcurl use this file, set the CURLOPT_NETRC option:
</p><p class="level0"> curl_easy_setopt(easyhandle, CURLOPT_NETRC, 1L);
</p><p class="level0">A very basic example of what such a .netrc file may look like:
</p><p class="level0"></p><pre><p class="level0"> machine myhost.mydomain.com
login userlogin
password secretword
</p></pre>
<p class="level0">
</p><p class="level0">All these examples have been cases where the password is optional, or at least where you could leave it out and have libcurl attempt to do its job without it. There are times when the password is not optional, such as when you use an SSL private key for secure transfers.
</p><p class="level0">To pass the known private key password to libcurl:
</p><p class="level0"> curl_easy_setopt(easyhandle, CURLOPT_KEYPASSWD, "keypassword");
</p><p class="level0"><a name="HTTP"></a></p><h2 class="nroffsh">HTTP Authentication</h2>
<p class="level0">The previous chapter showed how to set user name and
password for getting URLs that require authentication. When using the
HTTP protocol, there are many different ways a client can provide those
credentials to the server and you can control which way libcurl will
(attempt to) use them. The default HTTP authentication method is called
'Basic', which is sending the name and password in clear-text in the
HTTP request, base64-encoded. This is insecure.
</p><p class="level0">At the time of this writing, libcurl can be built
to use: Basic, Digest, NTLM, Negotiate, GSS-Negotiate and SPNEGO. You
can tell libcurl which one to use with CURLOPT_HTTPAUTH as in:
</p><p class="level0"> curl_easy_setopt(easyhandle, CURLOPT_HTTPAUTH, CURLAUTH_DIGEST);
</p><p class="level0">And when you send authentication to a proxy, you
can also set authentication type the same way but instead with
CURLOPT_PROXYAUTH:
</p><p class="level0"> curl_easy_setopt(easyhandle, CURLOPT_PROXYAUTH, CURLAUTH_NTLM);
</p><p class="level0">Both these options allow you to set multiple types
(by ORing them together), to make libcurl pick the most secure one out
of the types the server/proxy claims to support. This method does
however add a round-trip since libcurl must first ask the server what it
supports:
</p><p class="level0"> curl_easy_setopt(easyhandle, CURLOPT_HTTPAUTH, CURLAUTH_DIGEST|CURLAUTH_BASIC);
</p><p class="level0">For convenience, you can use the 'CURLAUTH_ANY'
define (instead of a list with specific types) which allows libcurl to
use whatever method it wants.
</p><p class="level0">When asking for multiple types, libcurl will pick the available one it considers "best" in its own internal order of preference.
</p><p class="level0"><a name="HTTP"></a></p><h2 class="nroffsh">HTTP POSTing</h2>
<p class="level0">We get many questions about how to issue HTTP
POSTs with libcurl the proper way. This chapter therefore includes
examples of both versions of HTTP POST that libcurl
supports.
</p><p class="level0">The first version is the simple POST, the most
common version, which most HTML pages using the &lt;form&gt; tag use. We
provide a pointer to the data and tell libcurl to post it all to the
remote site:
</p><p class="level0"></p><pre><p class="level0"> char *data="name=daniel&project=curl";
curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDS, data);
curl_easy_setopt(easyhandle, CURLOPT_URL, "<a href="http://posthere.com/">http://posthere.com/</a>");
</p><p class="level0"> curl_easy_perform(easyhandle); /* post away! */
</p></pre>
<p class="level0">
</p><p class="level0">Simple enough, huh? Since you set the POST options
with the CURLOPT_POSTFIELDS, this automatically switches the handle to
use POST in the upcoming request.
</p><p class="level0">Ok, so what if you want to post binary data that
also requires you to set the Content-Type: header of the post? Well,
binary posts prevent libcurl from being able to do strlen() on the data
to figure out the size, so we must tell libcurl the size of
the post data. Setting headers in libcurl requests is done in a generic
way, by building a list of our own headers and then passing that list
to libcurl.
</p><p class="level0"></p><pre><p class="level0"> struct curl_slist *headers=NULL;
headers = curl_slist_append(headers, "Content-Type: text/xml");
</p><p class="level0"> /* post binary data */
curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDS, binaryptr);
</p><p class="level0"> /* set the size of the postfields data */
curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDSIZE, 23L);
</p><p class="level0"> /* pass our list of custom made headers */
curl_easy_setopt(easyhandle, CURLOPT_HTTPHEADER, headers);
</p><p class="level0"> curl_easy_perform(easyhandle); /* post away! */
</p><p class="level0"> curl_slist_free_all(headers); /* free the header list */
</p></pre>
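<p class="level0">To see why the size has to be given explicitly, consider a payload with a zero byte in the middle. The sketch below uses a made-up five-byte buffer; the function names are only for illustration.
</p>

```c
#include <stddef.h>
#include <string.h>

/* Five bytes of made-up binary payload with a zero byte in the middle. */
static const char binary_payload[] = { 'a', 'b', '\0', 'c', 'd' };

/* What strlen() reports: it stops at the first zero byte. */
size_t payload_strlen(void)
{
  return strlen(binary_payload);
}

/* The real size, which is the value to pass as CURLOPT_POSTFIELDSIZE. */
long payload_size(void)
{
  return (long)sizeof(binary_payload);
}
```

<p class="level0">Here strlen() sees only the bytes before the zero, while sizeof gives the full length of the array. Passing the full length with CURLOPT_POSTFIELDSIZE makes libcurl send the whole buffer.
</p>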
<p class="level0">
</p><p class="level0">While the simple examples above cover the majority
of all cases where HTTP POST operations are required, they don't do
multi-part formposts. Multi-part formposts were introduced as a better
way to post (possibly large) binary data and were first documented in
RFC 1867 (updated in RFC 2388). They're called multi-part because
they're built from a chain of parts, each part being a single unit of
data. Each part has its own name and contents. You can in fact create
and post a multi-part formpost with the regular libcurl POST support
described above, but that would require that you build the formpost
yourself and provide it to libcurl. To make that easier, libcurl provides <a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_formadd.html">curl_formadd(3)</a>. Using this function, you add parts to the form. When you're done adding parts, you post the whole form.
</p><p class="level0">The following example sets two simple text parts
with plain textual contents, and then a file with binary contents and
uploads the whole thing.
</p><p class="level0"></p><pre><p class="level0"> struct curl_httppost *post=NULL;
struct curl_httppost *last=NULL;
curl_formadd(&post, &last,
CURLFORM_COPYNAME, "name",
CURLFORM_COPYCONTENTS, "daniel", CURLFORM_END);
curl_formadd(&post, &last,
CURLFORM_COPYNAME, "project",
CURLFORM_COPYCONTENTS, "curl", CURLFORM_END);
curl_formadd(&post, &last,
CURLFORM_COPYNAME, "logotype-image",
CURLFORM_FILECONTENT, "curl.png", CURLFORM_END);
</p><p class="level0"> /* Set the form info */
curl_easy_setopt(easyhandle, CURLOPT_HTTPPOST, post);
</p><p class="level0"> curl_easy_perform(easyhandle); /* post away! */
</p><p class="level0"> /* free the post data again */
curl_formfree(post);
</p></pre>
<p class="level0">
</p><p class="level0">Multipart formposts are chains of parts using
MIME-style separators and headers. This means that each of these
separate parts gets a few headers that describe its individual
content-type, size etc. To let your application fine-tune the
formpost even more, libcurl allows you to supply your own set of custom
headers for an individual form part. You can of course supply
headers to as many parts as you like, but this little example shows
how you set headers for one specific part when you add it to the post
handle:
</p><p class="level0"></p><pre><p class="level0"> struct curl_slist *headers=NULL;
headers = curl_slist_append(headers, "Content-Type: text/xml");
</p><p class="level0"> curl_formadd(&post, &last,
CURLFORM_COPYNAME, "logotype-image",
CURLFORM_FILECONTENT, "curl.xml",
CURLFORM_CONTENTHEADER, headers,
CURLFORM_END);
</p><p class="level0"> curl_easy_perform(easyhandle); /* post away! */
</p><p class="level0"> curl_formfree(post); /* free post */
curl_slist_free_all(headers); /* free custom header list */
</p></pre>
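<p class="level0">To make the "MIME-style separators and headers" a little more concrete, here is a rough sketch of how a single text part and the closing separator of a multipart/form-data body are laid out. libcurl builds all of this for you; the helper names and the boundary string are hypothetical and only serve the illustration.
</p>

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write one text part of a multipart/form-data body
   into buf. Returns the number of bytes written, or -1 if buf is too
   small. */
int format_text_part(char *buf, size_t buflen, const char *boundary,
                     const char *name, const char *contents)
{
  int n = snprintf(buf, buflen,
                   "--%s\r\n"
                   "Content-Disposition: form-data; name=\"%s\"\r\n"
                   "\r\n"
                   "%s\r\n",
                   boundary, name, contents);
  return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}

/* Hypothetical helper: write the closing separator that ends the form. */
int format_final_boundary(char *buf, size_t buflen, const char *boundary)
{
  int n = snprintf(buf, buflen, "--%s--\r\n", boundary);
  return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}
```

<p class="level0">The same boundary string is announced in the request's own "Content-Type: multipart/form-data; boundary=..." header. Custom headers passed with CURLFORM_CONTENTHEADER end up in the header block of the part they were added to.
</p>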
<p class="level0">
</p><p class="level0">All options on an easyhandle are "sticky": they remain the same until changed, even across calls to <a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_easy_perform.html">curl_easy_perform(3)</a>.
You may therefore need to tell curl to go back to a plain GET request if you
intend to do one as your next request. You force an easyhandle to go
back to GET by using the CURLOPT_HTTPGET option:
</p><p class="level0"> curl_easy_setopt(easyhandle, CURLOPT_HTTPGET, 1L);
</p><p class="level0">Just setting CURLOPT_POSTFIELDS to "" or NULL will
*not* stop libcurl from doing a POST. It will just make it POST without
any data to send!
</p><p class="level0"><a name="Showing"></a></p><h2 class="nroffsh">Showing Progress</h2>
<p class="level0">
</p><p class="level0">For historical and traditional reasons, libcurl
has a built-in progress meter that, when switched on, is
displayed in your terminal.
</p><p class="level0">Switch on the progress meter by, oddly enough, setting CURLOPT_NOPROGRESS to zero. This option is set to 1 by default.
</p><p class="level0">For most applications however, the built-in
progress meter is useless and what is interesting instead is the ability
to specify a progress callback. The function pointer you pass to
libcurl will then be called at irregular intervals with information
about the current transfer.
</p><p class="level0">Set the progress callback with CURLOPT_PROGRESSFUNCTION, passing a pointer to a function that matches this prototype:
</p><p class="level0"></p><pre><p class="level0"> int progress_callback(void *clientp,
double dltotal,
double dlnow,
double ultotal,
double ulnow);
</p></pre>
<p class="level0">
</p><p class="level0">If any of the input arguments is unknown, a 0 will
be passed. The first argument, the 'clientp' is the pointer you pass to
libcurl with CURLOPT_PROGRESSDATA. libcurl won't touch it.
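</p><p class="level0">A callback matching that prototype might look like the sketch below. The struct progress_state type and its fields are hypothetical; they stand for whatever your application passes as 'clientp'.
</p>

```c
/* Hypothetical per-transfer state, handed to libcurl with
   CURLOPT_PROGRESSDATA and given back as 'clientp'. */
struct progress_state {
  double last_percent;  /* last download percentage, -1 if total unknown */
  int abort_requested;  /* set non-zero elsewhere to cancel the transfer */
};

/* Matches the prototype shown above. Returning a non-zero value asks
   libcurl to abort the transfer. */
int progress_callback(void *clientp,
                      double dltotal,
                      double dlnow,
                      double ultotal,
                      double ulnow)
{
  struct progress_state *st = clientp;

  (void)ultotal; /* upload figures are not used in this sketch */
  (void)ulnow;

  if(dltotal > 0.0)
    st->last_percent = dlnow * 100.0 / dltotal;
  else
    st->last_percent = -1.0; /* a total of 0 means "unknown" */

  return st->abort_requested;
}
```

<p class="level0">You would register the function with CURLOPT_PROGRESSFUNCTION and a pointer to the state with CURLOPT_PROGRESSDATA. CURLOPT_NOPROGRESS must also be set to zero, or the callback is never called.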
</p><p class="level0"><a name="libcurl"></a></p><h2 class="nroffsh">libcurl with C++</h2>
<p class="level0">
</p><p class="level0">There's basically only one thing to keep in mind when using C++ instead of C when interfacing libcurl:
</p><p class="level0">The callbacks CANNOT be non-static class member functions.
</p><p class="level0">Example C++ code:
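</p><p class="level0"></p><pre><p class="level0">class AClass {
public:
  /* static, so it has a plain function signature that libcurl can call */
  static size_t write_data(void *ptr, size_t size, size_t nmemb,
                           void *ourpointer)
  {
    /* do what you want with the data */
    return size * nmemb; /* tell libcurl that all bytes were handled */
  }
};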
</p></pre>
<p class="level0">
</p><p class="level0"><a name="Proxies"></a></p><h2 class="nroffsh">Proxies</h2>
<p class="level0">
</p><p class="level0">What "proxy" means according to Merriam-Webster:
"a person authorized to act for another" but also "the agency, function,
or office of a deputy who acts as a substitute for another".
</p><p class="level0">Proxies are exceedingly common these days.
Companies often only offer Internet access to employees through their
proxies. Network clients or user-agents ask the proxy for documents, the
proxy does the actual request and then it returns them.
</p><p class="level0">libcurl supports SOCKS and HTTP proxies. When a
given URL is wanted, libcurl will ask the proxy for it instead of trying
to connect to the actual host identified in the URL.
</p><p class="level0">If you're using a SOCKS proxy, you may find that libcurl doesn't quite support all operations through it.
</p><p class="level0">For HTTP proxies: the fact that the proxy is a
HTTP proxy puts certain restrictions on what can actually happen. A
requested URL that might not be a HTTP URL will still be passed to
the HTTP proxy to deliver back to libcurl. This happens transparently,
and an application may not need to know. I say "may", because at times
it is very important to understand that all operations over a HTTP proxy
use the HTTP protocol. For example, you can't invoke your own custom
FTP commands or even get proper FTP directory listings.
</p><p class="level0">
</p><p class="level0"><a name="Proxy"></a><span class="nroffip">Proxy Options</span>
</p><p class="level1">
</p><p class="level1">To tell libcurl to use a proxy at a given port number:
</p><p class="level1"> curl_easy_setopt(easyhandle, CURLOPT_PROXY, "proxy-host.com:8080");
</p><p class="level1">Some proxies require user authentication before allowing a request, and you pass that information similar to this:
</p><p class="level1"> curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD, "user:password");
</p><p class="level1">If you want to, you can specify the host name only
in the CURLOPT_PROXY option, and set the port number separately with
CURLOPT_PROXYPORT.
</p><p class="level1">Tell libcurl what kind of proxy it is with CURLOPT_PROXYTYPE (if you don't, it will assume a HTTP proxy):
</p><p class="level1"> curl_easy_setopt(easyhandle, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS4);
</p><p class="level1">
</p><p class="level0"><a name="Environment"></a><span class="nroffip">Environment Variables</span>
</p><p class="level1">
</p><p class="level1">libcurl automatically checks and uses a set of
environment variables to know what proxies to use for certain protocols.
The names of the variables follow an ancient de facto standard
and are built up as "[protocol]_proxy" (note the lower casing). This
means the variable 'http_proxy' is checked for the name of a proxy to use
when the input URL is HTTP. Following the same rule, the variable named
'ftp_proxy' is checked for FTP URLs. Again, the proxies are always HTTP
proxies; the different names of the variables simply allow different
HTTP proxies to be used.
</p><p class="level1">The proxy environment variable contents should be
in the format "[protocol://][user:password@]machine[:port]". The
protocol:// part is simply ignored if present (so <a href="http://proxy/">http://proxy</a>
and bluerk://proxy will do the same) and the optional port number
specifies on which port the proxy operates on the host. If not
specified, the internal default port number will be used and that is
most likely *not* the one you would like it to be.
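</p><p class="level1">As an illustration of that format, the sketch below splits such a string into its pieces. It is not how libcurl does it internally; struct proxy_spec and parse_proxy_env are hypothetical names, and the sketch ignores details such as IPv6 literals and trailing slashes.
</p>

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; not part of libcurl. */
struct proxy_spec {
  char userpwd[128]; /* "user:password", empty if absent */
  char host[256];    /* machine name */
  long port;         /* 0 when no port was given */
};

/* Hypothetical helper: split "[protocol://][user:password@]machine[:port]".
   Returns 0 on success, -1 if a field does not fit or the host is empty. */
int parse_proxy_env(const char *value, struct proxy_spec *out)
{
  const char *scheme_end = strstr(value, "://");
  const char *at;
  const char *colon;
  size_t len;

  memset(out, 0, sizeof(*out)); /* also zero-terminates both strings */

  if(scheme_end)
    value = scheme_end + 3;     /* the protocol part is simply ignored */

  at = strrchr(value, '@');
  if(at) {
    len = (size_t)(at - value);
    if(len >= sizeof(out->userpwd))
      return -1;
    memcpy(out->userpwd, value, len);
    value = at + 1;
  }

  colon = strrchr(value, ':');
  len = colon ? (size_t)(colon - value) : strlen(value);
  if(len == 0 || len >= sizeof(out->host))
    return -1;
  memcpy(out->host, value, len);

  if(colon)
    out->port = strtol(colon + 1, NULL, 10);
  return 0;
}
```

<p class="level1">A port of 0 in the result means the string carried no port, which is the case where the internal default would be used.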
</p><p class="level1">There are two special environment variables.
'all_proxy' sets the proxy for any URL in case the protocol-specific
variable wasn't set, and 'no_proxy' defines a list of hosts that should
not use a proxy even though a variable may say so. If 'no_proxy' is a
plain asterisk ("*") it matches all hosts.
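</p><p class="level1">The precedence between these variables can be sketched as follows. The function names are hypothetical, the 'no_proxy' list is assumed to be comma-separated, and only exact host names are compared, which is simpler than what a real implementation would do.
</p>

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return 1 if host is listed in a comma-separated
   no_proxy value, or if that value is a plain "*". Only exact host names
   match in this sketch. */
int host_in_no_proxy(const char *no_proxy, const char *host)
{
  size_t hostlen = strlen(host);

  if(!no_proxy)
    return 0;
  if(strcmp(no_proxy, "*") == 0)
    return 1;

  while(*no_proxy) {
    size_t len;
    while(*no_proxy == ',' || *no_proxy == ' ')
      no_proxy++;                       /* skip separators */
    len = strcspn(no_proxy, ", ");      /* length of the next entry */
    if(len > 0 && len == hostlen && strncmp(no_proxy, host, len) == 0)
      return 1;
    no_proxy += len;
  }
  return 0;
}

/* Hypothetical helper: pick the proxy for a host from the contents of the
   three kinds of variables. NULL arguments stand for unset variables. */
const char *select_proxy(const char *protocol_proxy, const char *all_proxy,
                         const char *no_proxy, const char *host)
{
  if(host_in_no_proxy(no_proxy, host))
    return NULL;                        /* no_proxy wins over the others */
  if(protocol_proxy && *protocol_proxy)
    return protocol_proxy;              /* e.g. the value of http_proxy */
  if(all_proxy && *all_proxy)
    return all_proxy;                   /* fallback for any protocol */
  return NULL;
}

/* Example wiring for HTTP URLs, reading the real environment. */
const char *http_proxy_for(const char *host)
{
  return select_proxy(getenv("http_proxy"), getenv("all_proxy"),
                      getenv("no_proxy"), host);
}
```

<p class="level1">A NULL result from select_proxy means "connect directly". Keeping the decision separate from the getenv() calls makes the precedence easy to test.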
</p><p class="level1">To explicitly disable libcurl's checking for and
using the proxy environment variables, set the proxy name to "" - an
empty string - with CURLOPT_PROXY.
</p><p class="level0"><a name="SSL"></a><span class="nroffip">SSL and Proxies</span>
</p><p class="level1">
</p><p class="level1">SSL is for secure point-to-point connections. This
involves strong encryption and similar things, which effectively makes
it impossible for a proxy to operate as a "man in between", which is the
proxy's task as previously discussed. Instead, the only way to have
SSL work over a HTTP proxy is to ask the proxy to tunnel everything
through without being able to check or fiddle with the traffic.
</p><p class="level1">Opening an SSL connection over a HTTP proxy is
therefore a matter of asking the proxy for a straight connection to the
target host on a specified port. This is done with the HTTP request
CONNECT. ("please mr proxy, connect me to that remote host").
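</p><p class="level1">libcurl sends the CONNECT request for you, but it can help to see roughly what goes over the wire. The sketch below formats a minimal request; the helper name is hypothetical and a real client may add further headers.
</p>

```c
#include <stdio.h>

/* Hypothetical helper: format the request a client sends to a HTTP proxy
   to open a tunnel to host:port. Returns the number of bytes written, or
   -1 if buf is too small. */
int format_connect_request(char *buf, size_t buflen,
                           const char *host, int port)
{
  int n = snprintf(buf, buflen,
                   "CONNECT %s:%d HTTP/1.1\r\n"
                   "Host: %s:%d\r\n"
                   "\r\n",
                   host, port, host, port);
  return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}
```

<p class="level1">If the proxy accepts, it answers with a 2xx status and from then on simply relays bytes in both directions, which is what lets the SSL handshake run end to end.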
</p><p class="level1">Because of the nature of this operation, where the
proxy has no idea what kind of data is passed in and out through
this tunnel, this breaks some of the very few advantages that come from
using a proxy, such as caching. Many organizations prevent this kind of
tunneling to destination port numbers other than 443 (which is the
default HTTPS port number).
</p><p class="level1">
</p><p class="level0"><a name="Tunneling"></a><span class="nroffip">Tunneling Through Proxy</span>
</p><p class="level1">As explained above, tunneling is required for SSL to work, and it is often even restricted to the operation SSL is intended for: HTTPS.
</p><p class="level1">This is however not the only time proxy-tunneling might offer benefits to you or your application.
</p><p class="level1">As tunneling opens a direct connection from your
application to the remote machine, it suddenly also re-introduces the
ability to do non-HTTP operations over a HTTP proxy. You can in fact use
things such as FTP upload or FTP custom commands this way.
</p><p class="level1">Again, this is often prevented by the administrators of proxies and is rarely allowed.
</p><p class="level1">Tell libcurl to use proxy tunneling like this:
</p><p class="level1"> curl_easy_setopt(easyhandle, CURLOPT_HTTPPROXYTUNNEL, 1L);
</p><p class="level1">In fact, there might even be times when you want
to do plain HTTP operations using a tunnel like this, as it then enables
you to operate on the remote server directly instead of asking the proxy to do
so. libcurl will not stand in the way of such innovative actions
either!
</p><p class="level1">
</p><p class="level0"><a name="Proxy"></a><span class="nroffip">Proxy Auto-Config</span>
</p><p class="level1">
</p><p class="level1">Netscape first came up with this. It is basically a
web page (usually using a .pac extension) with a Javascript that when
executed by the browser with the requested URL as input, returns
information to the browser on how to connect to the URL. The returned
information might be "DIRECT" (which means no proxy should be used),
"PROXY host:port" (to tell the browser where the proxy for this
particular URL is) or "SOCKS host:port" (to direct the browser to a
SOCKS proxy).
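</p><p class="level1">If your application obtains such a result string by some other means, turning it into libcurl settings is straightforward. The sketch below classifies a single result; the enum and function names are hypothetical, and it does not handle results that list several alternatives.
</p>

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of one PAC result string. */
enum pac_kind { PAC_DIRECT, PAC_PROXY, PAC_SOCKS, PAC_UNKNOWN };

/* Hypothetical helper: classify result and copy its "host:port" part into
   hostport (capacity len). hostport is left empty for DIRECT or on
   failure. */
enum pac_kind parse_pac_result(const char *result, char *hostport,
                               size_t len)
{
  enum pac_kind kind;
  const char *rest;

  if(len)
    hostport[0] = '\0';

  if(strcmp(result, "DIRECT") == 0)
    return PAC_DIRECT;

  if(strncmp(result, "PROXY ", 6) == 0)
    kind = PAC_PROXY;
  else if(strncmp(result, "SOCKS ", 6) == 0)
    kind = PAC_SOCKS;
  else
    return PAC_UNKNOWN;

  rest = result + 6;              /* skip the keyword and the space */
  if(strlen(rest) >= len)
    return PAC_UNKNOWN;           /* would not fit in hostport */
  strcpy(hostport, rest);
  return kind;
}
```

<p class="level1">For PAC_PROXY or PAC_SOCKS the host:port string can be handed to CURLOPT_PROXY, with CURLOPT_PROXYTYPE set accordingly; for PAC_DIRECT an empty string for CURLOPT_PROXY disables proxy use.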
</p><p class="level1">libcurl has no means to interpret or evaluate
Javascript and thus it doesn't support this. If you find yourself in a
position where you face this nasty invention, the following advice has
been mentioned and used in the past:
</p><p class="level1">- Depending on the Javascript complexity, write up a script that translates it to another language and execute that.
</p><p class="level1">- Read the Javascript code and rewrite the same logic in another language.
</p><p class="level1">- Implement a Javascript interpreter; people have successfully used the Mozilla Javascript engine in the past.
</p><p class="level1">- Ask your admins to stop this, for a static proxy setup or similar.
</p><p class="level1"><a name="Persistence"></a></p><h2 class="nroffsh">Persistence Is The Way to Happiness</h2>
<p class="level0">
</p><p class="level0">Re-cycling the same easy handle several times when doing multiple requests is the way to go.
</p><p class="level0">After each single <a class="emphasis" href="http://curl.haxx.se/libcurl/c/curl_easy_perform.html">curl_easy_perform(3)</a>
operation, libcurl will keep the connection alive and open. A
subsequent request using the same easy handle to the same host might
just be able to use the already open connection! This reduces network
impact a lot.
</p><p class="level0">Even if the connection is dropped, all new connections
involving SSL to the same host will benefit from libcurl's
session ID cache, which drastically reduces re-connection time.
</p><p class="level0">FTP connections that are kept alive save a lot of
time, as the command-response round-trips are skipped. You also
don't risk being refused when you try to log in again, which can happen on the
many FTP servers that only allow N persons to be logged in at the same
time.
</p><p class="level0">libcurl caches DNS name resolving results, to make lookups of a previously looked up name a lot faster.
</p><p class="level0">Other interesting details that improve performance for subsequent requests may also be added in the future.
</p><p class="level0">Each easy handle will attempt to keep the last few
connections alive for a while in case they are to be used again. You
can set the size of this "cache" with the CURLOPT_MAXCONNECTS option.
Default is 5. There is very seldom any point in changing this value, and
if you think of changing this it is often just a matter of thinking
again.
</p><p class="level0">To force your upcoming request to not use an
already existing connection (it will even close one first if there
happens to be one alive to the same host you're about to operate on),
set CURLOPT_FRESH_CONNECT to 1. In a similar
spirit, you can also forbid the connection used by the upcoming request from "lying" around
and possibly getting re-used after the request by setting
CURLOPT_FORBID_REUSE to 1.
</p><p class="level0"><a name="HTTP"></a></p><h2 class="nroffsh">HTTP Headers Used by libcurl</h2>
<p class="level0">When you use libcurl to do HTTP requests, it'll pass
along a series of headers automatically. It might be good for you to
know and understand these. You can replace or remove them by using the
CURLOPT_HTTPHEADER option.
</p><p class="level0">
</p><p class="level0"><a name="Host"></a><span class="nroffip">Host</span>
</p><p class="level1">This header is required by HTTP 1.1 and even many
1.0 servers and should be the name of the server we want to talk to.
This includes the port number if anything but default.
</p><p class="level1">
</p><p class="level0"><a name="Pragma"></a><span class="nroffip">Pragma</span>
</p><p class="level1">"no-cache". Tells a possible proxy to not grab a copy from the cache but to fetch a fresh one.
</p><p class="level1">
</p><p class="level0"><a name="Accept"></a><span class="nroffip">Accept</span>
</p><p class="level1">"*/*".
</p><p class="level1">
</p><p class="level0"><a name="Expect"></a><span class="nroffip">Expect</span>
</p><p class="level1">When doing POST requests, libcurl sets this header
to "100-continue" to ask the server for an "OK" message before it
proceeds with sending the data part of the post. If the POSTed data
amount is deemed "small", libcurl will not use this header.
</p><p class="level1"><a name="Customizing"></a></p><h2 class="nroffsh">Customizing Operations</h2>
<p class="level0">There is an ongoing development today where more and
more protocols are built upon HTTP for transport. This has obvious
benefits as HTTP is a tested and reliable protocol that is widely
deployed and has excellent proxy-support.
</p><p class="level0">When you use one of these protocols, and even when
doing other kinds of programming you may need to change the traditional
HTTP (or FTP or...) manners. You may need to change words, headers or